Journal of 


Experimental Psychology 


ARTHUR W. MELTON, Editor 
Air Force Personnel and Training Research Center
Lackland Air Force Base
San Antonio, Texas


CONSULTING EDITORS 


Judson S. Brown, State University of Iowa
Cletus J. Burke, Indiana University
Robert M. Gagné, AF Personnel and Training Research Center
W. R. Garner, The Johns Hopkins University
Frank A. Geldard, University of Virginia
James J. Gibson, Cornell University
Clarence H. Graham, Columbia University
David A. Grant, University of Wisconsin
Lloyd G. Humphreys, AF Personnel and Training Research Center
Arthur L. Irion, Tulane University
Donald B. Lindsley, University of California, Los Angeles
Neal E. Miller, Yale University
Edwin B. Newman, Harvard University
Kenneth W. Spence, State University of Iowa
Benton J. Underwood, Northwestern University
Delos D. Wickens, Ohio State University


Lorraine Bouthilet, Managing Editor 





CONTENTS 


Rate of Verbal Conditioning in Relation to Stimulus Variability:
C. J. Burke, W. K. Estes, and S. Hellyer
The CS-UCS Interval in GSR Conditioning: G. Moeller
The Effect of Pre-experimental and Experimental Anxiety on Recall Efficiency:
R. M. Merrill
Reinforcement Schedules in Habit Reversal—A Confirmation : 


J. H. Grosslight, J. F. Hall, and W. Scott
A Test of Whether the “Nonrewarded” Animals Learned as Much as the “Rewarded”
Animals in the California Latent Learning Study: J. H. Kanner
A Note on the Ballard Reminiscence Phenomenon: H. Ammons and A. L. Irion
Knowledge of Results in the Acquisition and Transfer of a Gunnery Skill:
M. Goldstein and C. H. Rittenhouse
Rate Recovery in a Repetitive Motor Task as a Function of Successive Rest Periods:
E. A. Bilodeau
The Relationship of Convergence and Elevation Changes to Judgments of Size:
T. G. Hermans
A Note on the Aubert Phenomenon: C. I. Sandstrom
Some Effects of Problem Complexity upon Problem Solution Efficiency in Different
Communication Nets: M. E. Shaw
Context Effects and the Validity of Loudness Scales: W. R. Garner





American Psychological Association 


Vol. 48 No. 3 September 1954 





JOURNAL OF EXPERIMENTAL PSYCHOLOGY 


The JOURNAL OF EXPERIMENTAL PSYCHOLOGY is published monthly,
two volumes per year, by the American Psychological Association, Inc. 
The subscription rate per volume is $7.50, or $15.00 annually. Single 
copies are $1.50. Subscriptions, orders, and business communications 
should be addressed to the American Psychological Association, Inc., 1333 
Sixteenth St. N.W., Washington 6, D. C. 


This JOURNAL publishes original experimental investigations which are
intended to contribute toward the development of psychology as an experi- 
mental science. Studies with normal human subjects are favored over 
studies involving abnormal or animal subjects. Studies in applied experi- 
mental psychology or engineering psychology may be accepted if they have 
broad implications for experimental and theoretical psychology. Normally, 
articles of a length exceeding 15 printed pages cannot be accepted. How- 
ever, an integrated series of studies accomplished simultaneously (e.g., 
most doctoral dissertations) must be presented in a single article, rather than 
in a series of articles. 


Articles are published in the order of their receipt, except in rare circum- 
stances. Authors are supplied with 50 free offprints without covers. All 
of the cost of an author’s alterations in galley proof is charged to the author. 
Priority in publication is given to articles whose authors assume the full 
cost of publication. In 1954 the publication cost is $20 per printed page. 


Authors of priority publications receive no free offprints. 


Address all articles submitted for regular or priority publication to the 
editor, Arthur W. Melton, Air Force Personnel and Training Research 
Center, Lackland Air Force Base, San Antonio, Texas. 


Manuscripts submitted to the JOURNAL must adhere to the conven- 
tions concerning reference citations, preparation of tables and figures, 
manuscript format, etc. as described in the Publication Manual of the 
American Psychological Association (Psychol. Bull., 1952, 49 (4, Suppl.),
389-449). In particular, the organization of the manuscript should be text,
references, footnotes, tables (one to a page), figure titles (typed on separate 
page), and figures. All typed material must be double-spaced with ample 
margins. The term “subject,” with its various forms, is abbreviated S, 
Ss, S’s, and Ss’; analogous abbreviations are used for “experimenter” (E)
and “observer” (O). When in doubt about the practices of the JOURNAL, 
authors should examine a recent issue. 





Entered as second-class matter, February 6, 1937, at the post office at Lancaster, Pa., under the Act of 
March 3, 1879. Acceptance for mailing at the special rate of postage provided for in paragraph (d-2), 
Section 34.40, P. L. & R. of 1948, authorized September 11, 1947. 


Send address changes to 1333 Sixteenth St. N.W., Washington 6, D. C. Address changes must reach 
the Subscription Office by the 25th of the month to take effect the following month. Undelivered copies 
resulting from address changes will not be replaced; subscribers should notify the post office that they will
guarantee second-class forwarding postage. Other claims for undelivered copies must be made within four
months of publication. 


Copyright, 1954, by the American Psychological Association, Inc. 
















RATE OF VERBAL CONDITIONING IN RELATION TO
STIMULUS VARIABILITY ¹


C. J. BURKE, W. K. ESTES, AND S. HELLYER ²


Indiana University


¹ This research was performed under a grant from the National Science Foundation.

² Now at National Defense Research Board, Ottawa, Canada.


A principal assumption of the theory
of stimulus variability in learning
presented in a recent paper (1) is that
the rate of learning a given response
depends upon the rate at which the
organism samples the population of
stimuli to which the response becomes
conditioned. The experiment to be
described represents an attempt to
obtain relatively direct evidence concerning
the tenability of the assumption
by testing a number of empirical
predictions that are derivable from it
in conjunction with the general principles
of statistical learning theory.
We shall first sketch informally the
predictions and the proposed method
of testing, then give formal derivations
and details of procedure in separate
sections.

When the theory is interpreted in
terms of a simple associative learning
experiment, the stimulating situation
is represented by a population, or set,
of stimulus elements, and it is assumed
that the organism is affected by a
random sample of these elements on
each trial. The ratio of mean sample
size to population size will be referred
to hereafter as the sampling ratio and
symbolized as $\theta$. Since learning is
presumed to occur with respect to a 
given stimulus element only if it is 
present in a trial sample, rate of 
learning should be related to size of the 
sampling ratio. On the other hand, 
after a large number of trials, mean 
response probability should be inde- 
pendent of the sampling ratio since 
this parameter expresses the prob- 
ability that an element will be sampled 
but has nothing to do with the relative 
probability of an element’s being 
conditioned to one or another re- 
sponse class once it is sampled. Fi- 
nally, one of the possible values of the 
theory lies in the fact that, once the 
values of the theoretical parameters 
have been determined for a given 
population of organisms, it should be 
possible to make exact quantitative 
predictions of behavior under modified 
conditions. 

The empirical tests of these pre- 
dictions were carried out in the same 
individualized modification of Hum- 
phreys’ (5) verbal conditioning situ- 
ation that has been utilized in studying 
other aspects of statistical association 
theory (2). In this situation two
mutually exclusive events, $E_1$ and $E_2$,
are designated and determine two
classes of trials, those which terminate
with $E_1$ and those which terminate
with $E_2$. On each trial a signal is
given, and in response to the signal,
S predicts which of the two designated
events will occur. The events $E_1$ and
$E_2$ are presented in a random sequence
with fixed probabilities $\pi$ and $1 - \pi$,
respectively, and are in no way contingent
upon the responses made by S
or upon characteristics of the signal.

In the present experiment, the 
signal was a pattern of lights on a 
panel. For one experimental group,
each light on the panel occurred with
probability 1/3, for a second group with
probability 2/3, and for a third group
with probability 1. It was expected
that values of the sampling ratio for 
the three groups would be directly 
related to these experimentally con- 
trolled probabilities of light occur- 
rences. 

A subsidiary problem investigated
in this study was the role of overt
responses to the reinforcing events,
$E_1$ and $E_2$. On each trial S indicates
his prediction as to whether $E_1$ or $E_2$
will occur by operating the appropriate
one of a pair of telegraph keys;
these responses are designated $A_1$ and
$A_2$. Now, in giving a theoretical
treatment of learning in this situation
(2), it has been assumed that the
occurrence of $E_1$ or $E_2$ at the termination
of a trial is followed by a response,
probably verbal, which belongs to
class $A_1$ or $A_2$, respectively. These
inferred responses are the analogues
of the UR in classical conditioning,
and the analogy between “verbal
conditioning” and classical conditioning
is rounded out if an overt $A_1$ or $A_2$
is required at the end of each trial.
In the present experiment comparisons
will be made between groups
which are required to make these
overt responses and otherwise comparable
groups which are not.


FORMAL THEORY


At the introduction of the signal on
any trial, there are two stimulus sets
present: $S_L$, representing stimulation
associated with the signal lights, and
$S_T$, uncontrolled stimulation perhaps
produced largely by the reinforcing
events and S’s responses on immediately
preceding trials. We shall represent
the sizes of these sets by $N_L$
and $N_T$ and let the probabilities that
any element in either set is sampled by
the organism on any given trial be,
respectively, $\theta_L$ and $\theta_T$. We see that
$N_L\theta_L$ and $N_T\theta_T$ are the numbers of
elements from $S_L$ and $S_T$ expected in
the sample. If we let $N$ be the number
of elements in the combined set,

$$N = N_L + N_T \tag{1}$$

and $\bar{\theta}$ be the mean value of $\theta$ in the
combined set,

$$\bar{\theta} = \frac{1}{N}\left(N_L\theta_L + N_T\theta_T\right) \tag{2}$$

then $N\bar{\theta}$ is the expected number of
elements from the combined set in the
sample.

We assume that every element in
each set is conditioned either to $A_1$ or
$A_2$ and that the sole function of the
events $E_1$ and $E_2$ is to evoke responses
equivalent to $A_1$ and $A_2$ at the end of
each trial. From simple contiguity
it therefore follows that on an $E_1$ trial
all elements sampled are conditioned
to $A_1$ and on an $E_2$ trial all elements
sampled are conditioned to $A_2$; this
means that, on an $E_2$ trial, elements
conditioned to $A_1$ at the start of the
trial will become disconnected from
$A_1$ and connected to $A_2$, provided
only that they are in the sample on
the trial. We let $F_L(n)$ and $F_T(n)$ be,
respectively, the proportions of elements
in $S_L$ and $S_T$ connected to $A_1$ at
the start of the nth trial.

The probability, $p(n)$, that response
$A_1$ will occur on the $(n + 1)$st
trial has been defined (1) as the
weighted mean of these proportions;
in the present case:

$$p(n) = \frac{1}{N\bar{\theta}}\left[N_L\theta_L F_L(n) + N_T\theta_T F_T(n)\right]. \tag{3}$$


Whenever $F_L(n)$ and $F_T(n)$ are known,
$p(n)$ can be calculated. These proportions
can be defined recursively as
follows:

$$\begin{aligned}
F_L(n+1) &= (1-\theta_L)F_L(n) + \theta_L \\
F_T(n+1) &= (1-\theta_T)F_T(n) + \theta_T
\end{aligned} \tag{4}$$

on an $E_1$ trial, and:

$$\begin{aligned}
F_L(n+1) &= (1-\theta_L)F_L(n) \\
F_T(n+1) &= (1-\theta_T)F_T(n)
\end{aligned} \tag{5}$$

on an $E_2$ trial. To illustrate the
reasoning underlying these equations,
we may discuss the pair for $F_L$. The
proportion $1 - \theta_L$ of the elements of
$S_L$ is not sampled and the status of
these unsampled elements does not
change. The proportion $\theta_L$ is sampled
on the trial and either all or none of
these will be connected with $A_1$ at the
end of the trial according as $E_1$ or $E_2$
occurs. We know that Equations 4
are applicable on a proportion $\pi$ of the
trials and Equations 5 on the remainder.
Thus we may write for the
average proportion of elements connected
to $A_1$:

$$\begin{aligned}
\bar{F}_L(n+1) &= (1-\theta_L)\bar{F}_L(n) + \theta_L\pi \\
\bar{F}_T(n+1) &= (1-\theta_T)\bar{F}_T(n) + \theta_T\pi.
\end{aligned} \tag{6}$$
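The averaging step that leads from Equations 4 and 5 to Equation 6 can be checked numerically. The following Python sketch is an illustration only: the function names, the parameter values ($\theta = 0.05$, $\pi = 0.1$, $F(0) = 0.5$), the number of simulated sequences, and the seed are assumptions chosen for the example and are not taken from the experiment.

```python
import random


def simulate_mean_F(theta, pi, F0, n_trials, n_runs=5000, seed=0):
    """Average F(n) over many random reinforcement sequences.

    Each run applies Equation 4 on an E1 trial (probability pi)
    and Equation 5 on an E2 trial. All argument values are illustrative.
    """
    rng = random.Random(seed)
    totals = [0.0] * (n_trials + 1)
    for _ in range(n_runs):
        F = F0
        totals[0] += F
        for n in range(1, n_trials + 1):
            if rng.random() < pi:
                F = (1 - theta) * F + theta   # E1 trial: sampled elements go to A1
            else:
                F = (1 - theta) * F           # E2 trial: sampled elements go to A2
            totals[n] += F
    return [t / n_runs for t in totals]


def expected_F(theta, pi, F0, n_trials):
    """Iterate Equation 6 for the expected proportion connected to A1."""
    values = [F0]
    for _ in range(n_trials):
        values.append((1 - theta) * values[-1] + theta * pi)
    return values


simulated = simulate_mean_F(theta=0.05, pi=0.1, F0=0.5, n_trials=30)
expected = expected_F(theta=0.05, pi=0.1, F0=0.5, n_trials=30)
max_gap = max(abs(s - e) for s, e in zip(simulated, expected))
print(f"largest gap between simulated and expected means: {max_gap:.4f}")
```

With a few thousand simulated sequences the two curves should differ only by sampling error, which is the sense in which Equation 6 describes the mean course of learning.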


It can be readily verified by substitution
that:

$$\begin{aligned}
\bar{F}_L(n) &= \pi - [\pi - F_L(0)](1-\theta_L)^n \\
\bar{F}_T(n) &= \pi - [\pi - F_T(0)](1-\theta_T)^n
\end{aligned} \tag{7}$$

are solutions of Equation 6 which
yield appropriate initial values. The
general expression for $p(n)$ is obtained
by substitution of Equation 7 into
Equation 3, giving:

$$p(n) = \pi - \frac{1}{N\bar{\theta}}[\pi - p(0)]\left[N_L\theta_L(1-\theta_L)^n + N_T\theta_T(1-\theta_T)^n\right]. \tag{8}$$


In writing Equation 8 we have made
the assumption that the proportions of
elements initially connected to $A_1$ are
the same in the two sets, or formally:

$$p(0) = F_L(0) = F_T(0). \tag{9}$$

Equation 8 is an expression for the
mean learning curve in the verbal
conditioning experiment as derived
from statistical learning theory. It
may be noticed that if the distinction
between the two sets $S_L$ and $S_T$ is not
made, then this expression reduces to
the one utilized as a first approximation
in previous work (2).
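The agreement between the closed form of Equation 8 and the recursion of Equation 6 combined by Equation 3 can be illustrated with a short Python sketch. It assumes that Equation 3 weights each set by its expected contribution to the sample ($N_L\theta_L$ and $N_T\theta_T$); the function names, the use of the proportion `w_L` $= N_L/N$ in place of the set sizes, and all numerical values are assumptions made for the example.

```python
def p_closed(n, pi, p0, w_L, theta_L, theta_T):
    """Equation 8, written with w_L = N_L / N (so N_T / N = 1 - w_L)."""
    w_T = 1 - w_L
    theta_bar = w_L * theta_L + w_T * theta_T            # Equation 2
    decay = (w_L * theta_L * (1 - theta_L) ** n
             + w_T * theta_T * (1 - theta_T) ** n)
    return pi - (pi - p0) * decay / theta_bar


def p_from_recursion(n, pi, p0, w_L, theta_L, theta_T):
    """Iterate Equation 6 for each set, then combine by Equation 3."""
    w_T = 1 - w_L
    F_L = F_T = p0                                        # Equation 9
    for _ in range(n):
        F_L = (1 - theta_L) * F_L + theta_L * pi
        F_T = (1 - theta_T) * F_T + theta_T * pi
    theta_bar = w_L * theta_L + w_T * theta_T
    return (w_L * theta_L * F_L + w_T * theta_T * F_T) / theta_bar


# Hypothetical parameter values, chosen only to exercise the formulas.
params = dict(pi=0.1, p0=0.5, w_L=0.2, theta_L=0.06, theta_T=0.03)
for n in (0, 1, 10, 50):
    print(n, round(p_closed(n, **params), 6), round(p_from_recursion(n, **params), 6))
```

Setting `theta_L` equal to `theta_T` in this sketch collapses the two decay terms into one, which corresponds to the single-set approximation mentioned above.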

Another equation of considerable
interest is obtained by summing both
sides of Equation 8 from 0 to $n - 1$.
This sum is the expected number of
$A_1$ responses in the first $n$ trials and is
given by:

$$R(n) = \sum_{\nu=0}^{n-1} p(\nu) = n\pi - \frac{1}{N\bar{\theta}}[\pi - p(0)]\left\{N_L[1-(1-\theta_L)^n] + N_T[1-(1-\theta_T)^n]\right\}. \tag{10}$$

We shall derive some theorems and
corresponding experimental predictions
from our basic equations in the
following section.
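As a check on the summation, the following Python sketch compares Equation 10 with a direct sum of Equation 8 over trials 0 to $n - 1$. The function names and parameter values are hypothetical, and `w_L` again stands for $N_L/N$.

```python
def p_n(n, pi, p0, w_L, theta_L, theta_T):
    """Equation 8 with w_L = N_L / N."""
    w_T = 1 - w_L
    theta_bar = w_L * theta_L + w_T * theta_T
    decay = (w_L * theta_L * (1 - theta_L) ** n
             + w_T * theta_T * (1 - theta_T) ** n)
    return pi - (pi - p0) * decay / theta_bar


def R_closed(n, pi, p0, w_L, theta_L, theta_T):
    """Equation 10: expected number of A1 responses in the first n trials."""
    w_T = 1 - w_L
    theta_bar = w_L * theta_L + w_T * theta_T
    bracket = (w_L * (1 - (1 - theta_L) ** n)
               + w_T * (1 - (1 - theta_T) ** n))
    return n * pi - (pi - p0) * bracket / theta_bar


# Hypothetical values used only to compare the two computations.
args = dict(pi=0.1, p0=0.5, w_L=0.2, theta_L=0.06, theta_T=0.03)
for n in (1, 5, 20, 120):
    direct = sum(p_n(v, **args) for v in range(n))
    print(n, round(direct, 6), round(R_closed(n, **args), 6))
```

The factor $\theta_L$ in Equation 8 cancels against the geometric sum $\sum_{\nu=0}^{n-1}(1-\theta_L)^\nu = [1-(1-\theta_L)^n]/\theta_L$, which is why no $\theta_L$ or $\theta_T$ appears inside the braces of Equation 10.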


THEOREMS AND PREDICTIONS 


Throughout this section we shall
consider that $N_L$, $N_T$, and $\theta_T$ are fixed
so that, for fixed $n$, the quantities
$p(n)$ and $R(n)$ are functions only of
the parameter that we plan to vary
experimentally, $\theta_L$.

Theorem I. As $n$ increases, $R(n)$
approaches asymptotically a linear
function which depends only upon $\bar{\theta}$
and directly observable magnitudes.
This theorem is verified by noting
that, since $(1 - \theta_L)$ and $(1 - \theta_T)$ lie
between 0 and 1, the quantities
$(1 - \theta_L)^n$ and $(1 - \theta_T)^n$ tend to 0
and Equation 10 reduces to

$$R(n) = n\pi - [\pi - p(0)]/\bar{\theta}. \tag{11}$$

This means that $\bar{\theta}$ can be estimated
whenever the other quantities are
known for a value of $n$ such that $p(n)$
is near asymptote.
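The estimation step implied by Theorem I amounts to solving Equation 11 for $\bar{\theta}$, that is, $\bar{\theta} = [\pi - p(0)]/[n\pi - R(n)]$. The Python sketch below illustrates this under stated assumptions: the value of $R(n)$ is generated from Equation 10 with hypothetical parameters rather than from data, and the function names are ours.

```python
def cumulative_A1(n, pi, p0, w_L, theta_L, theta_T):
    """Equation 10, with w_L = N_L / N."""
    w_T = 1 - w_L
    theta_bar = w_L * theta_L + w_T * theta_T
    bracket = (w_L * (1 - (1 - theta_L) ** n)
               + w_T * (1 - (1 - theta_T) ** n))
    return n * pi - (pi - p0) * bracket / theta_bar


def estimate_theta_bar(n, R_n, pi, p0):
    """Solve Equation 11 for the mean sampling ratio."""
    return (pi - p0) / (n * pi - R_n)


# Hypothetical parameters; n is taken large enough for the decay terms to vanish.
pi, p0 = 0.1, 0.5
w_L, theta_L, theta_T = 0.2, 0.06, 0.03
true_theta_bar = w_L * theta_L + (1 - w_L) * theta_T
R_300 = cumulative_A1(300, pi, p0, w_L, theta_L, theta_T)
estimate = estimate_theta_bar(300, R_300, pi, p0)
print(f"true mean sampling ratio {true_theta_bar:.5f}, estimate {estimate:.5f}")
```

If $n$ is too small for the terms $(1-\theta_L)^n$ and $(1-\theta_T)^n$ to be negligible, the same formula overestimates $\bar{\theta}$, since the bracketed quantity in Equation 10 is then less than $N$.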

Theorem II. For sufficiently large
values of $n$, $R(n)$ is an increasing
function of $\theta_L$ if $\pi - p(0)$ is positive
and a decreasing function if $\pi - p(0)$
is negative. The proof of this is obtained
by substituting Equation 2
into Equation 11. We might remark
that this result holds in general only
for large $n$; without special conditions
imposed on the $\theta$ values neither $p(n)$
nor $R(n)$ is necessarily monotonic in
$\theta_L$ for all values of $n$.

Theorem III. As $n$ increases, $p(n)$
approaches the value $\pi$ and becomes
independent, therefore, of $\theta_L$. Since
the quantities $(1 - \theta_L)$ and $(1 - \theta_T)$
are fractions between 0 and 1, this
follows easily from Equation 8.

Prediction I. According to Equation 10,
the rate of learning, as
reflected in the cumulative number of
$A_1$ responses, should be affected if the
value of $\theta_L$ is varied experimentally.
Unless all of the parameter values are
known, it is not possible to predict in
advance the precise ordering that
would be exhibited on early trials by
groups having different $\theta_L$ values.
It can, however, be predicted that for
a sufficiently large number of trials,
the cumulative $A_1$ frequency should
be an increasing function of $\theta_L$ if
$\pi > p(0)$ and a decreasing function if
$\pi < p(0)$.

Prediction II. Following a large
number of trials, the proportion of $A_1$
responses in any trial block will be
independent of $\theta_L$.

Prediction III. Let $\theta_1$, $\theta_2$, and $\theta_3$
be three specified values of $\theta_L$ and
suppose they are related by two
independent linear relations

$$k_1\theta_1 = k_2\theta_2 = \theta_3, \tag{12}$$

where $k_1$ and $k_2$ are known constants.
Suppose further that verbal conditioning
experiments have been carried out
to a value of $n$ sufficient for the application
of Theorem I with $\theta_L$ equal to
$\theta_1$ in one experiment and $\theta_L$ equal to
$\theta_2$ in the other, so that corresponding
values of $\bar{\theta}$, say $\bar{\theta}_1$ and $\bar{\theta}_2$, have been
estimated. Then we have from Equations
2 and 12,

$$\begin{aligned}
\bar{\theta}_1 &= N_L\theta_1/N + N_T\theta_T/N; \\
\bar{\theta}_2 &= N_L\theta_2/N + N_T\theta_T/N \\
&= (N_L\theta_1/N)\frac{k_1}{k_2} + N_T\theta_T/N.
\end{aligned} \tag{13}$$

All quantities in Equations 13 are
known except for $N_L\theta_1/N$ and
$N_T\theta_T/N$; the equations are linear
and can easily be solved for these two
unknowns. When this has been done
it is possible to compute $\bar{\theta}_3$ from:

$$\bar{\theta}_3 = N_L\theta_3/N + N_T\theta_T/N = (N_L\theta_1/N)k_1 + N_T\theta_T/N. \tag{14}$$

If the theory is adequate, knowledge
of $\bar{\theta}_3$ enables prediction of the value
of $R(n)$, again for sufficiently large $n$,
in a new experiment where $\theta_L$ is given
the value $\theta_3$.
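The solution of Equations 13 and the computation in Equation 14 can be sketched in a few lines of Python. The function name is hypothetical. The input values $\bar{\theta}_1 = 0.0255$ and $\bar{\theta}_2 = 0.0299$ are the rounded mean sampling ratios reported in Table 3 and are used here only as an arithmetic illustration, with $k_1 = 3$ and $k_2 = 3/2$ as in the Design section.

```python
def predict_theta_bar_3(theta_bar_1, theta_bar_2, k1, k2):
    """Solve Equations 13 for the two unknowns, then apply Equation 14.

    a stands for N_L * theta_1 / N and b for N_T * theta_T / N.
    Requires k1 != k2, otherwise the two equations are not independent.
    """
    ratio = k1 / k2
    a = (theta_bar_2 - theta_bar_1) / (ratio - 1)   # from the difference of Equations 13
    b = theta_bar_1 - a
    theta_bar_3 = a * k1 + b                        # Equation 14
    return a, b, theta_bar_3


a, b, theta_bar_3 = predict_theta_bar_3(0.0255, 0.0299, k1=3, k2=1.5)
print(f"N_L*theta_1/N = {a:.4f}, N_T*theta_T/N = {b:.4f}, theta_bar_3 = {theta_bar_3:.4f}")
```

The resulting $\bar{\theta}_3$ could then be entered in Equation 11, together with $\pi$ and $p(0)$, to obtain a predicted $R(n)$ for the third condition.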

The experiment described below 
was designed mainly to test these 
three predictions. 


METHOD 


Apparatus. The apparatus has already been
described in detail (2), and only the salient
features will be repeated here. The experimental
room was dark except for the stimulus
lights in the apparatus, and it contained four
booths and a signal board. On the signal board
were mounted twelve 12-v., .25-amp. light bulbs
evenly spaced about a circle of 9-in. radius.
Any sample of one or more of these could serve
as a signal for the response. Within each booth
was a pair of telegraph keys, each key directly
beneath a “reinforcing” light. The presentations
of the signal and of the “reinforcing”
stimuli were made by an automatic programming
device, and Ss’ responses were recorded
automatically.

Design. Six groups of Ss were run, three in
each of two successive experimental procedures.
In the first procedure, Ss were not instructed to
respond to the “reinforcing” lights with overt
key presses. In the second, they were instructed
so to respond. Within each of these procedures,
the three groups were differentiated by the statistical
properties of the signal. In Group I a
random sample of the 12 signal lights was drawn
on each trial and a probability of 1/3 was assigned
to each light. In Group II the same procedure
was followed, but with a probability of 2/3 for
each light. In Group III, the probability was 1,
so that all 12 lights appeared as a signal on every
trial. In all groups at least one light was presented
on each trial. The probability that any
stimulus element associated with any light on
the board will occur on any given trial is the
product of the conditional probability that it
will occur if the light is sampled and the probability
that the light will be sampled. By varying
the signal probability we have insured that
the values of $\theta_L$ for the three groups will satisfy
Equation 12 with $k_1 = 3$ and $k_2 = 3/2$.
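The signal-generation scheme described above can be sketched in Python as follows. This is an illustration only: the text does not state how the requirement of at least one light per trial was enforced, so the redrawing of empty patterns is an assumption, as are the function name and the seed.

```python
import random

N_LIGHTS = 12  # number of signal lights on the board


def draw_signal(prob, rng):
    """Return the indices of the lights shown on one trial.

    Each light is included independently with probability prob.
    Assumption: an empty pattern is simply redrawn, so that at least
    one light appears on every trial.
    """
    while True:
        pattern = [i for i in range(N_LIGHTS) if rng.random() < prob]
        if pattern:
            return pattern


rng = random.Random(1954)  # arbitrary seed for reproducibility
for label, prob in (("Group I", 1 / 3), ("Group II", 2 / 3), ("Group III", 1)):
    print(label, draw_signal(prob, rng))
```

Under this assumed redraw rule the effective probability per light is slightly above the nominal value in Group I, because the chance of an empty pattern, $(2/3)^{12}$, is a little under 1%.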

Subjects. The Ss were 72 students obtained
from beginning courses in psychology during the 
academic year 1952-53; there were 12 Ss in each 
experimental group. The Ss were scheduled in 
groups of four for the experimental sessions, and 
each group of four was assigned at random to one 
of the signal probabilities, subject to the restric- 
tion that there be an equal number of groups 
under each signal probability. When one or two 
Ss missed appointments, the remainder of the 
group was run and the missing Ss were subse- 
quently run. In no case, however, was any S 
run alone—data were collected only with two 
or more of the booths occupied. 

Procedure. When Ss had been seated in the
booths, they were read instructions, similar in
detail to those reported elsewhere (2), to the
effect that they were to respond to the signal
pattern on each trial by operating the telegraph
key corresponding to the reinforcing light that
they expected to follow. In addition, Ss assigned
to the overt response procedure were instructed
to respond to the reinforcing light on
each trial by operating the appropriate key.
After the instructions had been read, the room
was darkened and four practice trials were given,
with E correcting any errors he detected. For
each S one of the two “reinforcing” lights was
designated as $E_1$, the other as $E_2$. The positions
of $E_1$ were counterbalanced, right and left, in
each group of four Ss. During the four practice
trials, the lights were given to every S in the
order $E_1$, $E_1$, $E_2$, $E_2$. After the practice trials
the instructions were reviewed by E. The recorder
was started and 120 trials were given to
each S. These trials were run off in continuous
sequence. For each set of four Ss a random
sequence of $E_1$’s and $E_2$’s was generated, with the
probability that any trial would be an $E_1$ fixed
at $\pi = 0.1$. Thus, the four Ss in each group
had the same pattern of signal lights and the
same sequence of reinforcing lights, save that
the side designated as the $E_1$ light was counterbalanced
on left and right.

The time relations during the run were:
duration of signal, 2 sec.; interval from cessation
of signal to onset of “reinforcing” light, 1
sec.; duration of “reinforcing” light, 0.8 sec.;
intertrial interval, 0.4 sec.


RESULTS AND DISCUSSION


Analyses of variance. Analyses of
variance were performed on the number
of $A_1$ responses made by each S
during the first 20 and the last 40 of
the 120 trials. These blocks were
selected because the first should be
sensitive to the differences during
early learning and the second should
be sensitive to any differences that
remain when Ss are performing near
asymptote. The analysis was done in a
factorial design with the side of the $E_1$
light added to the two basic variables
of the experiment.

The effect of the overt response.
Neither in the early stages of learning
nor in the vicinity of the asymptote
does the behavior seem dependent
upon the inclusion of an overt response
following the presentation of
$E_1$ and $E_2$. In the analyses of variance
for the first 20 and the last 40
trials, the F’s for this variable are,
respectively, .26 and .24, with 1 and
60 df in each case. It was suspected
that some tendencies toward alternation
behavior might be introduced
with the overt response, but analysis
of conditional probabilities in the
sequence of trials has also failed to
reveal any important effect. We
shall consider ourselves justified in
grouping the data from the two response
procedures in subsequent discussion.

Prediction I. This prediction is
that the signal probability groups
should differ in learning rate and that,
for sufficiently large values of $n$, as
the signal probability is increased, the
value of $R(n)$ should decrease. In
the analysis of variance for the first
20 trials, an F value of 3.35 with 2 and
60 df was found, indicating that the
sampling ratio has had an effect,
significant at the 5% level, on the
number of $A_1$ responses in the early
stages of learning. According to Prediction II,
the level of significance in
this comparison should be attenuated
if the data from an extended number of
trials are used. For, beyond a certain
point, the inclusion of further
trials will increase the error term but
make no contribution to the group
differences. Thus, this analysis of
variance comparison is properly confined
to the first block of 20 trials on
a priori grounds. It should be noted
that we could not have stated in advance
of the experiment that differences
in the observed direction would
develop during the first 20 trials. If,
for example, the value of $\theta_T$ were large
relative to the values of $\theta_L$, then the
prediction would not follow. In a
later section, however, it will be shown
that the value of $\theta_T$ estimated from the
data is approximately equal to the
smallest of the $\theta_L$ values, and given
this information the prediction does
follow, both for the present data and
for any further experiments run in
the same situation.

FIG. 1. Observed and theoretical curves
representing cumulative mean frequency of the
less frequently reinforced response for each
signal-probability group.

In Fig. 1 the mean cumulative
number of $A_1$ responses for each
sampling ratio group is plotted against
number of trials. The fitted curves
have been obtained by methods
described below. The number of $A_1$
responses has been accumulated
through each block of ten trials. It
is seen from Fig. 1 that the data
points fall in the predicted order at
the end of every cumulative block.
Since successive points are not statistically
independent, it is difficult to
assign a level of confidence to this
result, but the evidence seems impressive.
The nonindependence of
cumulative data points could in itself
lead to persistence of accumulated
differences; however it would have no
influence on the order of the groups or
on the asymptotic slopes, and in
these data both order and asymptotic
slope accord with theoretical prediction.
Finally, in Table 1 are given the
total number of $A_1$ responses for each
sampling ratio under each response
procedure. If we consider the two
response procedures as replications,
we see that we have twice obtained
the predicted order.


TABLE 1

TOTAL NUMBER OF $A_1$ RESPONSES FOR
EACH EXPERIMENTAL GROUP

Signal              Overt Response
Probability     Absent    Present    Total
1/3               327       332       659
2/3               309       275       584
1                 295       267       562














Prediction II. After a large number
of trials, the proportion of $A_1$
responses in any block of trials is
asserted to be independent of the
sampling ratio. This is supported by
the fact that the trends of the data
points in Fig. 1 are very nearly parallel
over the latter half of their courses.
It receives additional substantiation
from the very small value of F, .19
with 1 and 60 df, obtained for the
effect of the sampling ratios on the
number of $A_1$ responses in the last 40
trials. Other investigators have,
without exception, found that the
proportion of $A_1$ responses in similar
experiments is, after a considerable
number of trials, near $\pi$ (2, 3, 4, 5, 6)
but have not dealt with a variable
signal.

Prediction III. It is asserted that,
for reasonably large values of $n$, the
average number of $A_1$ responses in the
first $n$ trials made by Group III is
predictable from the data of the other
two groups. To test this prediction,
the computations indicated in Theorem I
and the formal statement of
Prediction III were made at each
tenth value of $n$ in the latter half of
the data. The results, together with
the corresponding observed values of
$R(n)$, are given in Table 2. The
prediction is seen to be surprisingly
close to the behavior. It should be
emphasized that no use has been
made of the observed data from Group
III in calculating these theoretical
values. Had not experimental considerations
indicated the desirability
of running the three groups concurrently,
the theoretical values could
have been calculated before the empirical
values were obtained.
Parameter estimates. It is possible
to utilize the theory in conjunction
with the records of the behavior to
recover and exhibit the statistical
properties of the stimulating situation.
In Table 3 the statistical description
of the stimulating situation is given.
Not all aspects of the data were used
in obtaining these values; they were
obtained from the average cumulative
number of $A_1$ responses over the entire
120 trials in each of the three groups
and from the trend of the data for
Groups II and III from the 70th to the
100th trial. In fitting these data, it
was assumed on the basis of evidence
contained in the data that for Groups
II and III the quantity $(1 - \theta_L)^n$ was
negligible when $n$ was 70 or larger.
The theoretical curves in Fig. 1 are
based on the values of the parameters
given in Table 3. The curves are in
general consistent with the data points
and tend to substantiate the statement
that the predicted effects of the
sampling value are present.
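The internal consistency of the entries in Table 3 can be checked against Equation 2. The Python sketch below recomputes the mean sampling ratio and the relative weight of the signal set from the tabled proportions and sampling ratios; the variable and function names are ours, and small discrepancies in the third decimal place are to be expected because the tabled inputs are rounded.

```python
# Values as printed in Table 3 (rounded by the source).
w_L, w_T = 0.191, 0.809           # N_L / N and N_T / N
theta_T = 0.0261                  # sampling ratio over S_T, same for all groups
theta_L_by_group = {"1/3": 0.0229, "2/3": 0.0458, "1": 0.0687}


def mean_sampling_ratio(theta_L):
    """Equation 2 written with proportions of elements."""
    return w_L * theta_L + w_T * theta_T


def weight_of_signal_set(theta_L):
    """Relative weight of S_L in the sample, N_L * theta_L / (N * theta_bar)."""
    return w_L * theta_L / mean_sampling_ratio(theta_L)


for group, theta_L in theta_L_by_group.items():
    print(group,
          round(mean_sampling_ratio(theta_L), 4),
          round(weight_of_signal_set(theta_L), 4))
```

The three values of $\theta_L$ stand in the ratio 1 : 2 : 3, as required by Equation 12 with $k_1 = 3$ and $k_2 = 3/2$.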
TABLE 2

PREDICTED AND OBSERVED AVERAGE NUMBER
OF $A_1$ RESPONSES IN n TRIALS FOR LARGE
VALUES OF n AND FOR A SIGNAL
PROBABILITY OF 1

  n     Predicted R(n)    Observed R(n)
 70         16.8              16.9
 80         17.9              18.2
 90         19.2              19.5
100         20.4              20.8
110         21.1              22.1
120         22.4              23.3


TABLE 3

ESTIMATED PARAMETER VALUES

                                                                  Sampling Ratio Group
                                                                  1/3       2/3       1
Prop. elements in $S_L$ ($N_L/N$)                                0.191     0.191     0.191
Prop. elements in $S_T$ ($N_T/N$)                                0.809     0.809     0.809
Sampling ratio over $S_T$ ($\theta_T$)                           0.0261    0.0261    0.0261
Sampling ratio over $S_L$ ($\theta_L$)                           0.0229    0.0458    0.0687
Sampling ratio over $S$ ($\bar{\theta}$)                         0.0255    0.0299    0.0342
Relative weight of $S_L$ in sample ($N_L\theta_L/N\bar{\theta}$) 0.1722    0.2940    0.3835
Relative weight of $S_T$ in sample ($N_T\theta_T/N\bar{\theta}$) 0.8278    0.7060    0.6165


Inspection of Table 3 reveals a
probable explanation for the relatively
small effects of variation in the sampling
ratio, namely the fact that only a
relatively small proportion, approximately
one-fifth, of the total population
of stimulus elements available
for sampling at the onset of a trial in
this experimental situation is associated
with the signal lights. If by
means of suitable procedures, e.g.,
increasing the intertrial interval, one
succeeded in increasing the relative
weight of the stimulus set associated
with the controlled signal, or CS,
sufficiently so that the effects of uncontrolled
stimulation were negligible,
then the entire theoretical treatment
would be greatly simplified. Equation 8
would reduce to the simple
growth function

$$p(n) = \pi - [\pi - p(0)](1 - \theta)^n \tag{15}$$

where $\theta$ is the sampling ratio associated
with the signal; and both $p(n)$
and $R(n)$ would be monotonically increasing
functions of $\theta$ for all $n$.

Other studies of constant vs. varied
stimulation. In the light of our analyses,
it is not surprising that earlier
studies of this problem (7, 8) have
yielded conflicting interpretations.
It appears that variability of stimulation
plays a significant but rather
complex role in learning experiments.
According to the statistical theory,
rate of learning should be inversely
related to degree of stimulus variability,
but with the point at which the
curve of response probability for a less 
variable condition begins to surpass 
the curve for a more variable condition 
depending upon the relative weight of 
the controlled experimental stimulus 
in the total stimulus complex. In the 
set of experiments reported by Leeper 
and Leeper (7) learning rate was found 
to be inversely related in some cases 
and unrelated in others to stimulus 
variability. Owing to the complex- 
ity of the situations they studied, it is 
difficult to say whether or not the 
inconsistencies in their results could 
be accounted for in terms of the 
present theory. Wolfle’s study of the 
maze situation (8) yielded results that 
seem to be generally in accord with 
theoretical expectations. The value 
of the asymptote parameter in the 
empirical equation fitted to Wolfle’s 
group learning curves was the same 
for all groups, while the slope param- 
eters were inversely related to degree 
of stimulus variability. 


SUMMARY AND CONCLUSIONS 


We have reported a verbal conditioning ex- 
periment in which the signal, or CS, was a 
random sample from a group of 12 individually 
manipulable lights. Within each experimental 
group the probability for any light to be sampled 
on any trial was constant and was the same for 
every light. Two sets of three groups were run 

















corresponding to probabilities of 1/3, 2/3, and 1. A
secondary purpose was to test for influence of 
overt responses following the “reinforcing” 
stimuli on the guessing behavior, and the two 
sets of three groups were differentiated with 
respect to the presence or absence of such re- 
sponses. Twelve college students were run as Ss 
in each of the six groups. The “reinforcing” 
stimuli were lights on the right and left of S. 
The S was to indicate his prediction as to which 
of these lights would follow the signal on each 
trial by pressing a telegraph key directly beneath 
it. For each S a designated member of this 
pair of lights was given a probability of 0.1 and 
the other member a probability of 0.9 of appear- 
ing following the signal on each trial. 

Some fundamental considerations in sta- 
tistical learning theory led to predictions of the 
effect of the variability in the signal and ulti- 
mately to the design of the experiment. Three 
predictions were formulated and tested: (a) 
Learning rate depends upon stimulus variability 
in such a way that cumulative frequency of the 
less frequently reinforced response over a suffi- 
ciently long series of trials is inversely related to 
magnitude of the signal probability. (b) The 
learning curve, in terms of relative frequency of 
a given response per trial block, tends to an 
asymptote that depends upon probability of 
reinforcement but is independent of the signal 
probability. (c) Total frequency of a given 
response over a sufficiently long series of trials 
can be predicted for any group once the necessary 
parameters have been evaluated from data of the 
other two groups. 

With respect to Prediction a, tests from inde- 
pendent aspects of the data substantiate the 
presence of the specified effects at better than 
the 5% level. Predictions b and c are verified
in detail. 


Finally, the requirement of overt responses 
following the “reinforcing” lights is found to 
have little, if any, effect upon the behavior. 


REFERENCES 


1. Estes, W. K., & Burke, C. J. A theory of
stimulus variability in learning. Psychol.
Rev., 1953, 60, 276-286.

2. Estes, W. K., & Straughan, J. H. Analysis
of a verbal conditioning situation in terms
of statistical learning theory. J. exp.
Psychol., 1954, 47, 225-234.

3. Grant, D. A., Hake, H. W., & Hornseth, J.
P. Acquisition and extinction of a verbal
conditioned response with differing percentages
of reinforcement. J. exp. Psychol.,
1951, 42, 1-5.

4. Hake, H. W., & Hyman, R. Perception of
the statistical structure of a random
series of binary symbols. J. exp. Psychol.,
1953, 45, 64-74.

5. Humphreys, L. G. Acquisition and extinction
of verbal expectations in a situation
analogous to conditioning. J. exp.
Psychol., 1939, 25, 294-301.

6. Jarvik, M. E. Probability learning and a
negative recency effect in the serial anticipation
of alternative symbols. J. exp.
Psychol., 1951, 41, 291-297.

7. Leeper, R., & Leeper, D. O. An experimental
study of equivalent stimulation in
human learning. J. gen. Psychol., 1932,
6, 344-377.

8. Wolfle, D. L. The relative efficiency of
constant and varied stimulation during
learning. III. The objective extent of
stimulus variation. J. comp. Psychol.,
1936, 22, 375-381.


(Received December 30, 1953) 








Journal of Experimental Psychology 
Vol. 48, No. 3, 1954 





THE CS-UCS INTERVAL IN GSR CONDITIONING¹


GEORGE MOELLER 
State University of Iowa* 


The study reported here was con- 
cerned with the relationship between 
the CS-UCS interval and level of 
performance for the trace-conditioned 
Féré GSR. As such, it supplements 
that dealing with the delayed-con- 
ditioned GSR (impedance change) 
recently reported by White and 
Schlosberg (11). Both of these in- 
vestigations were designed to deter- 
mine whether or not the optimal CS- 
UCS interval is the same for the con- 
ditioned GSR, supposedly mediated 
only by the autonomic nervous sys- 
tem, as for responses mediated by the 
central nervous system. 

White and Schlosberg found the 
relation between the CS-UCS interval 
and level of performance for the long- 
latency GSR to be similar to that for 
short-latency responses mediated by 
the central nervous system. On the 
basis of these findings, they claim that 
current S-R theories must be rejected. 
This conclusion apparently rests upon 
the assumption that the latency of the 
response conditioned must affect the 
function relating the CS-UCS interval 
to level of performance if S-R theories 
are correct. For reasons which will 
be discussed later, the present author 
does not believe that the GSR as 
measured by White and Schlosberg, 
or in the experiment reported here, 
provides the conditions necessary for 
an adequate test of S-R theories. 

While White and Schlosberg were 


1 This article was abstracted from a dissertation
submitted to the Graduate College of the
State University of Iowa in partial fulfillment of
the requirements for the Ph.D. degree. The
writer wishes to express his gratitude to Professors
J. S. Brown and K. W. Spence for their
aid and advice, which made this research
possible.
* Now at Connecticut College.


interested in the S-S vs. S-R contro- 
versy, the present author hoped to 
reconcile two sets of data. Specifi- 
cally, the failures to find conditioning 
with interstimulus intervals of 2,500 
msec. in studies of that variable 
(4, 5, 8, 10, 11) do not fit with the 
numerous reports of successful con- 
ditioning with longer interstimulus 
intervals (e.g., 1, 2, 7, 9). One 
difference between the two classes of 
experiments is that those concerned 
with the CS-UCS interval had, prior 
to the White-Schlosberg study, dealt 
with CNS-mediated responses, while 
those in which conditioning with a long 
CS-UCS interval was successful in- 
volved responses supposedly mediated 
only by the autonomic nervous sys- 
tem. Accordingly, it was thought 
that the present study would reveal 
that the function relating performance 
to the interstimulus interval is shifted 
upward on the time axis, or ap- 
proaches zero more slowly, for auto- 
nomic responses. In order to test 
this hypothesis, four conditioning 
groups, each trained with a particular 
interstimulus interval, and a pseudo- 
conditioning control group were em- 
ployed in this investigation of the 
Féré GSR. 


METHOD 


Apparatus.—The GSR was recorded as a
change in apparent resistance by means of a 
modified version of the circuit described by 
Haggard and Gerbrands (3). A 20,000-ohm 
potentiometer was employed as a shunt across 
the recording element, instead of the 10,000-ohm 
value recommended by Haggard and Gerbrands, 
in order to decrease the period of the recording 
instrument. Pieces of sponge, saturated with 
1% zinc sulfate electrode paste and placed in
electrodes similar to those used by Haggard and 
Gerbrands, provided the contacts with S’s palms. 

Three decade-type electronic interval timers 
controlled the presentation of the stimuli. The 




CS was a 100-msec. “white noise,” presented by 
means of headphones at 20 db above each S’s 
threshold. The UCS was provided by the discharge
of a 4-μfd. condenser for 100 msec.
through the tips of the index and middle fingers 
of S’s left hand and a series resistance of 78,000 
ohms. The S’s fingers were moistened with 
electrocardiograph paste in order to insure con- 
tact over the entire surface of the polished silver 
electrodes (each 1.7 cm. in diameter). Shock 
intensity was controlled by varying the voltage 
to which the condenser was charged from 130 
to 380 v.

During preliminary work it was found that S 
often became bored and drowsy, and conse- 
quently exhibited an exaggerated startle when 
the stimuli were presented. Therefore, each S 
was required to give one free association to each 
of 50 nouns presented on a Hull-type memory 
drum at the rate of one noun every 30 sec. 
Nouns which might have affective connotations 
for a majority of the Ss were not used. Condi- 
tioning and test trials were interpolated between 
presentations of the words. 

Subjects.—The Ss were 75 males from an
introductory psychology course; they were as- 
signed in equal numbers to five experimental 
groups in a random order. Since the recording 
circuit operated only within the range of 5,000- 
ohms to 65,000-ohms basic resistance, Ss whose 
basal level lay outside this range were dismissed. 
Eight Ss were dismissed during the second day’s 
session because their GSR’s fluctuated widely 
and rapidly, and accordingly could not be re- 
corded. The data for two other Ss were dis- 
carded when it was found that they failed to 
respond to the CS at any time, including the 
adaptation trials, during the experimental 
session. 

Procedure.—Five groups of 15 Ss each were
employed in this study. Four of the groups 
were conditioned at interstimulus intervals of 
250, 450, 1,000, or 2,500 msec., and one group 
served as a pseudoconditioning control. The 
duration of the interstimulus interval was 
measured from the onset of the CS to the onset 
of the UCS. 

For the control group, the CS followed the 
UCS by 3 to 5 sec. on all but test trials. Accord- 
ingly, the forward CS-UCS interval for this 
condition varied in random fashion between 21 
and 61 sec. 

Each S served for 1 hr. on each of two suc- 
cessive days. On both days S was seated alone 
in a sound-shielded room. Any communication 
between S and E after the reading of the in- 
structions and adjustment of the apparatus was 
through an intercommunication system. The 
instructions were designed to conceal the fact 
that this was a conditioning experiment; S was 
told that the study was concerned with the 


effects of intersensory stimulation upon the 
auditory threshold. 

On the first day, all Ss received the same 
training. The “white noise” was presented five 
times by the method of limits in order to obtain 
a rough measure of S’s auditory threshold. The 
CS was set 20 db above this threshold for the 
remainder of the experiment. The shock was 
then adjusted over three trials to an intensity 
that S considered distinctly unpleasant. During 
both sessions, the charge on the condenser was 
increased by 10 v. on every fourth ordinal trial. 
A supplementary experiment indicated that 
conditioning is identical with either constant or 
increasing shock when the CS-UCS interval is 
450 msec. (6). After the adjustment of the CS 
and UCS, S received 8 adaptation trials (CS 
alone) followed by 15 pseudoconditioning trials 
with 6 test trials interspersed. 

On the second day, 3 trials were devoted to 
readjustment of the shock; then S received 8 
adaptation trials, 23 conditioning trials, with 7 
test trials interspersed, and 5 extinction trials. 
The control group received pseudoconditioning 
trials instead of the conditioning trials given the 
other groups. The test trials were presented 
every second or third trial after each increase in 
shock, with order of testing varied randomly 
from S to S. 

The mean intertrial interval was approxi- 
mately 45 sec. The intervals were chosen so 
that each trial followed a response to the word 
list by 8, 10, or 12 sec., and preceded the presen- 
tation of the next noun by at least 18 sec. A 
particular sequence of intervals was assigned at 
random to each S. 

Response measure criteria.—The data presented
in this report are based on measures
obtained during the seven test trials, and five 
extinction trials of the second experimental 
session. The first extinction trial was treated 
as the eighth test trial. Measures of antici-
patory responses were not employed since the 
usual GSR latency of 1-3 sec. might have biased 
the results in favor of the groups conditioned 
with longer interstimulus intervals. Decre- 
ments in resistance of 200 ohms or more were 
recorded as responses if their latency with respect 
to the CS was at least 1 sec., and no more than 
5 sec. plus the appropriate interstimulus inter- 
val. The longest latency allowed a conditioning 
group, 7.5 sec., was used in scoring the responses 
of the pseudoconditioning group and adaptation- 
trial responses of all groups. 
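
The scoring rule just described can be restated as a short sketch. This is only an illustration of the stated criteria (a decrement of at least 200 ohms, and a latency from CS onset of at least 1 sec. and at most 5 sec. plus the interstimulus interval); the function and constant names are hypothetical and do not come from the original report.

```python
MIN_DECREMENT_OHMS = 200         # smallest resistance drop counted as a response
MIN_LATENCY_SEC = 1.0            # earliest admissible latency, from CS onset
BASE_WINDOW_SEC = 5.0            # added to the interstimulus interval
MAX_CONDITIONING_ISI_MS = 2500   # longest interval used for a conditioning group


def latency_limit_sec(isi_ms=None):
    """Upper latency bound in seconds.

    Pass isi_ms=None for the pseudoconditioning group or for adaptation
    trials, which were scored with the longest window (7.5 sec.).
    """
    if isi_ms is None:
        isi_ms = MAX_CONDITIONING_ISI_MS
    return BASE_WINDOW_SEC + isi_ms / 1000.0


def is_scored_response(decrement_ohms, latency_sec, isi_ms=None):
    """True if a resistance decrement meets the size and latency criteria."""
    return (decrement_ohms >= MIN_DECREMENT_OHMS
            and MIN_LATENCY_SEC <= latency_sec <= latency_limit_sec(isi_ms))
```

For example, a 250-ohm decrement with a latency of 5.3 sec. would fall outside the window for the 250-msec. group (limit 5.25 sec.) but inside it for the 2,500-msec. group (limit 7.5 sec.).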

Amplitude of response was measured as the 
difference between conductance at the onset of 
the response and at the first clear-cut maximum 
reached within 7.5 sec. thereafter. Each S’s 
measures were reduced by the mean amplitude 
of his responses during the last three adaptation 
trials in order to remove the effects of “random” 







responses and individual differences in initial re- 
sponse strength. Frequency of response was de- 
termined from the corrected amplitude measures. 


A square root transformation, $y = \sqrt{f + 1/2}$, was
applied to each S’s frequency scores to make 
those data amenable to statistical tests. 
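
The amplitude correction and the frequency transformation can be sketched as follows. Several points are assumptions made for illustration and are not stated in the report: conductance is taken as 10^6 divided by resistance in ohms (giving micromhos), a response is counted toward frequency when its corrected amplitude is positive, and the constant added under the square root is taken to be 0.5. All function names are hypothetical.

```python
import math


def conductance_micromhos(resistance_ohms):
    """Convert a resistance in ohms to a conductance in micromhos."""
    return 1e6 / resistance_ohms


def response_amplitude(onset_ohms, peak_ohms):
    """Conductance gain from response onset to its first clear maximum."""
    return conductance_micromhos(peak_ohms) - conductance_micromhos(onset_ohms)


def corrected_amplitudes(test_amplitudes, adaptation_amplitudes):
    """Subtract the mean of the last three adaptation-trial amplitudes."""
    last_three = adaptation_amplitudes[-3:]
    baseline = sum(last_three) / len(last_three)
    return [a - baseline for a in test_amplitudes]


def transformed_frequency(corrected, constant=0.5):
    """Square root of (response count + constant).

    Counting only positive corrected amplitudes, and the value of the
    constant, are assumptions of this sketch.
    """
    f = sum(1 for a in corrected if a > 0)
    return math.sqrt(f + constant)
```

A drop from 20,000 to 19,000 ohms, for instance, corresponds to a conductance increase of about 2.63 micromhos before correction.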


RESULTS 


Performance curves showing the 
mean corrected amplitude of response 
for blocks of two test trials are pre- 
sented in Fig. 1. The amplitude meas- 
ures of the four groups serving under 
a conditioning regimen all exhibited 
an initial increase, whereas the re- 
sponses of the control group were con- 
sistently smaller than they had been 
during adaptation. The performance 
of the 450-msec. group is superior 
throughout conditioning and extinc- 
tion, while that of the 2,500-msec. Ss 
declined after 11 reinforcements. 
With the exception of the 2,500-msec. 
condition, for which performance im- 
mediately dropped to the level of the 
control group, the relative efficacy of 
the several intervals was maintained 
during extinction, the 450-msec. Ss 
giving responses of greater amplitude 
during extinction than in conditioning. 


[Figure 1 plot not reproduced. Horizontal axis: blocks of test and extinction trials (conditioning, then extinction). The legend includes the control group.]


Fig. 1. Mean corrected conductance change
(micromhos) in blocks of two test trials and
extinction trials. 4 indicates mean corrected
conductance change for the last three adaptation
trials.








[Figure 2 plot not reproduced. Horizontal axis: CS-UCS interval (milliseconds), marked at 250, 450, 1000, and 2500.]


Fig. 2. Mean corrected conductance change
(micromhos) for the last block of four test trials
as a function of the interstimulus interval employed
in conditioning. Control group performance
is indicated by the dashed line C.


The most reasonable interpretation of 
these data seems to be that the 250-, 
450-, and 1,000-msec. groups were 
conditioned, but that the performance 
of the 2,500-msec. group exemplifies 
sensitization. 

The relation between mean ampli- 
tude of response for the last four test 
trials and the interstimulus interval is 
shown in Fig. 2. An analysis of 
variance of these measures was signi- 
ficant at the .025 level of confidence. 
Tests of treatment pairs yielded t’s
which were significant for compari- 
sons of the 250-, 450-, and 1,000- 
msec. conditions with the controls at 
the .01, .01, and .05 levels of confi- 
dence, respectively. The p for the 
comparison of the 450- and 2,500- 
msec. groups was .025. None of the 
remaining differences between treat- 
ment pairs was significant at the .05 
level. 

The frequency measures yielded 
roughly the same picture as did the 
amplitude measures, except that the 
250- and 2,500-msec. conditions were
superior to the 450- and 1,000-msec. 
conditions during the first four test 
trials.* In the later stages of training 
and during extinction, the rank order 


* The data summarized in this report are pre- 
sented in their entirety in (6). 



















of the conditions was the same as that 
found with the amplitude measures. 
Since an analysis of variance cannot be 
appropriately applied to frequencies 
which range from zero to four, the 
groups were compared on the basis of 
mean transformed total frequency of 
response during conditioning. The 
F ratio for these data was just short 
of significance, p being .07. 


DISCUSSION


The CS-UCS interval in GSR conditioning.—The
function relating CS-UCS
interval and performance in trace condi- 
tioning of the GSR (Féré) was found to 
be very similar to that obtained in 
studies of the eyelid and other responses 
involving skeletal musculature. This 
supports the findings of White and 
Schlosberg regarding the delayed-condi- 
tioned GSR (impedance change). 

Unfortunately, this leaves the central 
problem of this research unanswered. 
Why is there no evidence of conditioning 
with a 2,500-msec. CS-UCS interval 
in studies of that variable when there 
are many reports of successful condi- 
tioning with longer interstimulus in- 
tervals? The discrepancy between these 
two sets of data does not appear to be 
due to differences in Ss, human vs. 
infrahuman, the nature of the CS, or in 
the number of conditioning trials em- 
ployed. 

The present experiment was designed 
to explore the possibility that the basis 
of the discrepancy lies in the type of 
response conditioned, specifically, that 
autonomic responses may be conditioned 
with much longer CS-UCS intervals than 
CNS-controlled responses. On the as- 
sumption that the GSR as measured 
in this experiment is a purely autonomic 
response, this hypothesis seems to be 
negated. However, these findings con- 
flict with Rodnick’s report (9) of success- 
ful trace and delayed GSR conditioning 
with 17.4-sec. and 20.1-sec. interstimulus 
intervals. 

The problem posed by Rodnick’s
findings may, perhaps, be resolved if a
hypothesis proposed by Brown is accepted.
Brown suggests that in the
present experiment the response was
perhaps not conditioned directly, but
that a simple skeletal response, such as
inspiration or tensing of the muscles,
which results in a recorded GSR, was
conditioned.⁴ It is well established that
such skeletal responses are accompanied
by events which lead to a recordable
GSR. Unfortunately, it was impossible
to observe S in the sound-shielded room
employed in this study, and to record his
skeletal responses, if any.

Brown’s hypothesis might lead to the 
prediction that in studies such as that 
reported here the relation between performance
and the CS-UCS interval
would be found to be similar to relation- 
ships obtained in investigations of the 
conditioned eyelid response, for the 
response actually conditioned is a skeletal 
response. This skeletal response in turn 
mediates the recorded GSR. On the 
other hand, it might, with the appro- 
priate subsidiary assumptions, lead to 
the prediction that a measure of pure 
autonomic conditioning would yield a 
function which was shifted upward on the 
time axis.

Implications for learning theories.— 
White and Schlosberg claim that their 
findings and, by extension of their argu- 
ment, those presented in this report, offer 
difficulties for Mowrer’s two-factor no- 
tion, Hull’s drive-reduction theory, and 
Guthrie’s S-R contiguity theory. Their 
major argument seems to be that if S-R 
theories, as opposed to S-S theories, are 
correct, a parameter dependent upon the 
type of response conditioned must appear 
in the function relating the CS-UCS
interval and response strength. In par- 
ticular, the optimal interstimulus in- 
terval for conditioning the GSR, which 
has a long latency, should be longer than 
for eyelid conditioning. White and 
Schlosberg demonstrate that the optimal 
interval for conditioning the GSR, under 
the conditions of their experiment, is 
identical with that found in studies of 
eyelid conditioning. As a consequence, 
they conclude that the generality of 
S-R theories must be questioned. If 


4 Brown, J. S. Personal communication, Janu- 
ary 2, 1953. 










it can be shown that the GSR recorded 
under conditions like those employed 
by White and Schlosberg is, as suggested 
above, a resultant of conditioned mus- 
cular responses, their conclusion would 
be unsupported factually. 

In any case, there is evidence, e.g., 
Rodnick’s experiment, which indicates 
that conditioning of responses pre- 
sumably mediated by the autonomic 
nervous system may be carried out with 
interstimulus intervals longer than 2,500 
msec. That evidence seems to con- 
tradict the findings of White and Schlos- 
berg and the experiment reported here, 
that no conditioning is obtained with 
a CS-UCS interval of 2,500 msec. Until 
the apparent contradiction between the 
findings of the several experiments cited 
above is reconciled, there seems to be 
little point in discussing the implications 
of the White-Schlosberg study, or that 
reported here, for the S-R vs. S-S issue. 

Performance of the 2,500-msec. group.— 
The 2,500-msec. group exhibits an initial 
improvement in terms of both amplitude 
and frequency measures. This rise 
might be attributed to a temporary in- 
crease in general motivational level re- 
sulting from the abrupt introduction of 
a noxious stimulus, and the presentation 
of two stimuli in close temporal succes- 
sion. If the temporal relations of the 
stimuli are such that conditioning does 
not occur readily, the initial increase in 
response strength should be followed by 
a decline in performance as adaptation to 
the stimuli occurs. This formulation 
would also account for the steady decline 
in strength of a backward conditioned 
response found by Spooner and Kellogg 
(10) and Pavlov (7). 


SUMMARY 


In this experiment four conditioning groups, 
trained at CS-UCS intervals of 250, 450, 1,000, 
or 2,500 msec., and a pseudoconditioning control 
group were employed in a study of the role of 
the interstimulus interval in the trace condition- 
ing of the Féré GSR. Fifteen undergraduate 
men served in each group. All Ss served for 
two days, the first under a pseudoconditioning 
regimen, the second in one of the five experi- 
mental groups. On the second day, each S 
received 23 conditioning or pseudoconditioning 







trials, 8 test trials, and 4 extinction trials. The 
CS was a “white noise” 20 db above each S’s 
threshold. The UCS was an electric shock pro- 
duced by the discharge of a condenser. 

The optimal CS-UCS interval was found to 
be 450 msec., shorter and longer intervals yield- 
ing poorer performance. This finding appears 
to indicate that the optimal interval for condi- 
tioning does not vary with the overt response 
being conditioned. However, it is suggested 
that the GSR, as measured in this experiment, 
was the resultant of a conditioned skeletal re- 
sponse, and that a measure of pure autonomic 
conditioning would not give the same results. 

Finally, arguments are advanced in opposition 
to the view of White and Schlosberg that this 
type of study is crucial to current theoretical 
controversies. 


REFERENCES 


1. Brown, J. S., & Jacobs, A. The role of fear
in the motivation and acquisition of responses.
J. exp. Psychol., 1949, 39, 747-759.

2. Estes, W. K., & Skinner, B. F. Some
quantitative properties of anxiety. J.
exp. Psychol., 1941, 29, 390-400.

3. Haggard, E. A., & Gerbrands, R. An
apparatus for the measurement of continuous
changes in palmar skin resistance.
J. exp. Psychol., 1947, 37, 92-98.

4. Kappauf, W. E., & Schlosberg, H. Conditioned
responses in the white rat: III.
Conditioning as a function of the length
of the period of delay. J. genet. Psychol.,
1937, 50, 27-45.

5. McAllister, W. R. Eyelid conditioning
as a function of the CS-UCS interval.
J. exp. Psychol., 1953, 45, 417-422.

6. Moeller, G. O., Jr. The role of the CS-UCS
interval in conditioning the GSR.
University Microfilms, Ann Arbor, Mich.,
Publ. No. 4090, 1952.

7. Pavlov, I. P. Conditioned reflexes. London:
Oxford Univ. Press, 1927.

8. Reynolds, B. The acquisition of a trace
conditioned response as a function of the
magnitude of the stimulus trace. J. exp.
Psychol., 1945, 35, 15-30.

9. Rodnick, E. H. Characteristics of delayed
and trace conditioned responses. J. exp.
Psychol., 1937, 20, 409-425.

10. Spooner, A., & Kellogg, W. N. The backward
conditioning curve. Amer. J. Psychol.,
1947, 60, 321-344.

11. White, C. T., & Schlosberg, H. Degree of
conditioning of the GSR as a function of
the period of delay. J. exp. Psychol.,
1952, 43, 357-362.


(Received December 21, 1953) 
















THE EFFECT OF PRE-EXPERIMENTAL AND 
EXPERIMENTAL ANXIETY ON 
RECALL EFFICIENCY¹

REED M. MERRILL²


Counseling Center, University of Washington 


The study reported here stems from 
two major areas of psychological 
interest and experimentation. The 
first is the concept of repression as 
originally developed by Freud (2, 
3). This approach laid the ground- 
work for Jung’s (7) word association 
method. Clinicians have generally 
accepted this technique and have at- 
tributed disturbances in associations 
to emotionally charged, repressed 
material. In line with repression 
theory it seems reasonable to use 
association tests as a means of study- 
ing the dynamics of retention since 
words with association disturbances 
should be more readily forgotten than 
words without association disturb- 
ances. Comprehensive reviews of re- 
pression have been made by Zeller 
(16), Sears (13), Gilbert (4), and 
Rapaport (10). The second area is a 
large number of studies evaluating 
the effects of experimental anxiety on 
recall. Most relevant to this study is 
the work of Laffal (8) and Zeller (17). 
Zeller attempted to develop an experi- 
mental analogue of repression by 
creating experimental anxiety and 
removing its effects on the ability to 
recall material previously learned. 

This study sets out to evaluate (a) 
the effect on recall of words with pre- 
experimental association disturbances 
as measured by the word association 
method, and (b) the effect on recall of


1 Part of a thesis submitted in partial fulfill- 
ment of the requirements for the degree of 
Doctor of Philosophy in the Department of 
Psychology, University of Washington. The 
author wishes to express his appreciation to Dr. 
Louise B. Heathers for supervision of the re- 
search and review of this manuscript. 

2 Now at the University of Utah. 


an anxiety-producing task similar to 
Zeller’s. 


METHOD


Subjects.—Eighty male, lower division college 
students served as Ss in the experiment. To 
assure obtaining naive Ss, they were selected 
according to age, educational level, and verbal 
ability as measured by the Linguistic Scale of 
the American Council on Education Psycho- 
logical Examination. Comparison of the experi- 
mental (E) and control (C) groups on these 
factors shows that the median age of Group E 
was 18.43 yr., as compared with 18.37 yr. for 
Group C; both groups contained 77.5% fresh- 
men and 22.5% sophomores; the mean ACE 
raw score was 74.05 (SD = 10.34) for Group 
E and 74.05 (SD = 9.89) for Group C. The 
Ss were assigned alternately to Groups E 
and C. Each S was seen twice. Originally an 
interval of three days was planned between 
these two sessions. This was impossible to 
maintain, but Groups E and C were quite com- 
parable on this variable; the median days’ inter- 
val between sessions of Group E was 3.50 days, 
as compared with 3.83 days for Group C. 

Procedure.—In the first session of the experi- 
ment a word association test was individually 
administered. Original reaction time, original 
response, reproduction reaction time, and reproduction
response were recorded. Reaction times were
measured with a stop watch to the
nearest second. The association test was composed
of 168 words taken from the association
tests of Jung (7), Kent-Rosanoff (12), and 
Rapaport (11). The test was given with the 
usual instructions. 

Between the first and second experimental 
periods, E selected the list of 20 words to be 
presented in the learning situation and prepared 
the experimental materials. A different list of 
20 words was prepared for each pair of Ss. For 
each S in the experimental group, E recorded all
words that either reflected partial repression 
(disturbing words) or that were neutral in tone 
as measured by the word association test. 
Words selected as disturbing words had to 
meet at least one of the following conditions: 
lengthened original reaction time, false reproduc- 
tion, or unusual original response. Words 
selected as control words had to meet all of the 
following conditions: short original reaction 




time, perfect reproduction, and usual original 
response. The words selected as possible dis- 
turbing and neutral words for an experimental S 
were then checked against the responses of the 
paired control S. This comparison resulted in 
the exclusion of some words, because the ten 
disturbing words finally selected had to meet 
both the criteria for disturbing words for the 
experimental S and the criteria for neutral 
words for the paired control S. The range of the 
number of disturbing words discarded due to 
failure to meet this criterion was from 7 to 30 
words; the mean number of words discarded 
was 17.9 (SD = 5.44). In turn, the ten neutral 
words finally retained had to meet the criteria 
for neutral words for both the experimental and 
the paired control S. Table 1 compares 
Groups E and C on the words selected by these 
procedures. 
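
The selection rules above can be summarized in a brief sketch. It is offered only as an illustration: the record fields are hypothetical yes/no judgments of the kind E would make from the association protocol, since the article does not give the reaction-time cutoffs, and the class and function names are not from the original report.

```python
from dataclasses import dataclass


@dataclass
class AssociationRecord:
    """One S's result for one stimulus word (fields are hypothetical)."""
    lengthened_rt: bool          # original reaction time judged long
    short_rt: bool               # original reaction time judged short
    correct_reproduction: bool   # original response reproduced correctly
    unusual_response: bool       # original response judged unusual
    usual_response: bool         # original response judged usual


def is_disturbing(rec):
    """At least one sign of disturbance is present."""
    return (rec.lengthened_rt
            or not rec.correct_reproduction
            or rec.unusual_response)


def is_neutral(rec):
    """All three signs of an undisturbed association are present."""
    return rec.short_rt and rec.correct_reproduction and rec.usual_response


def classify_for_pair(experimental, control):
    """Label a word for a matched pair, or return None if it is excluded."""
    if is_disturbing(experimental) and is_neutral(control):
        return "disturbing"
    if is_neutral(experimental) and is_neutral(control):
        return "neutral"
    return None
```

A word that is disturbing for the experimental S but not neutral for the paired control S returns None, which corresponds to the discarded words described above.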

The learning task was similar for all Ss; it 
required the learning of the 20 words to one 
perfect reproduction by the anticipation method. 
The material was presented to S on a memory 
drum. The cue for each word to be learned was 
a nonsense syllable. The nonsense syllables 
used as cues were all taken from the 20% level 
of meaningfulness as determined by Hull (5). 
Four orders of presentation of the learning ma- 
terial were used to reduce serial learning effects. 
Disturbing and neutral words were alternated 
throughout the list; half the Ss began with 
neutral, half with disturbing words. 

The memory drum was constructed locally 
and was electrically driven. The timing was set 
for a 2-sec. interval between stimuli; thus, the 
nonsense syllable was presented for 2 sec. fol- 
lowed by the nonsense syllable paired with the 
response word for 2 sec. The S was allowed 2 
sec. to anticipate the response word; 4 sec. were 
allowed between trials to shift to the next order 
of presentation. 
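
As a small worked example of the timing just described, the sketch below computes how long the learning material would run, assuming each of the 20 pairs occupies 2 sec. for the syllable alone plus 2 sec. for the syllable with its word, and that 4 sec. separate successive trials. The constant and function names are hypothetical.

```python
SYLLABLE_ALONE_SEC = 2       # nonsense syllable shown alone (anticipation period)
SYLLABLE_WITH_WORD_SEC = 2   # syllable shown together with the response word
BETWEEN_TRIALS_SEC = 4       # pause while shifting to the next order
PAIRS_PER_LIST = 20          # 10 disturbing and 10 neutral words


def trial_seconds():
    """Duration of one pass through the list."""
    return PAIRS_PER_LIST * (SYLLABLE_ALONE_SEC + SYLLABLE_WITH_WORD_SEC)


def session_seconds(n_trials):
    """Duration of n_trials passes, counting the pauses between them."""
    if n_trials < 1:
        return 0
    return n_trials * trial_seconds() + (n_trials - 1) * BETWEEN_TRIALS_SEC
```

Under these assumptions one pass through the list takes 80 sec., and five passes take 416 sec.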

When S appeared for the second session, he 
was given the usual instructions for learning 







paired associates. He was not required to pro- 
nounce or spell the nonsense syllables nor was he 
told the criterion for learning. 

After attainment of the criterion of learning, 
an interpolated task was introduced prior to the 
first recall period. The purpose of this task was 
manifold: (a) to provide an activity of a quite 
different kind to facilitate forgetting of the word 
lists; (b) to prevent rehearsal of the lists; and (c) 
to evaluate the effect of two kinds of task on 
recall efficiency. Upon completion of the learn- 
ing trials E produced nine Stanford-Binet blocks, 
arranging seven of them, equally spaced, before 
S on a small table; the eighth was handed to S 
and E retained the ninth block. The E then
gave the following instructions: 

“I am going to tap out some patterns on these
blocks I have on the table before you and I 
expect you to tap out the same pattern. So that 
you can correct your errors, I will make a check 
mark on the sheet when you successfully com- 
plete the item and a circle when you fail. These 
first two will be examples to familiarize you with 
the task and will not be scored. Are you 
ready?” 

Although all Ss were given the same direc- 
tions, half of each group was given a neutral 
task and the other half an anxiety-producing or 
ego-threatening task. The Ss were assigned 
randomly to the task conditions. The Ss doing 
the neutral task were given simple, easy patterns 
(e.g., 15432, 17345, 23461) and were encouraged. 
They were able to tap the patterns accurately, 
and they could see a record of successful per- 
formance since E handled the recording so that 
it could easily be observed. The Ss doing the 
anxiety-producing task were given difficult or 
impossible patterns to follow (e.g., 137214623, 
754621735) and were given evidence of their 
failure. Verbal criticism by E was minimal 
since the task was designed to be upsetting, but 
occasionally he remarked that persons of college 
ability should be able to do much better on this 


TABLE 1

Comparison of Groups E and C on Disturbing and Neutral Words
Selected from the Word Association Test

                                      Disturbing Words             Neutral Words
                                    Group E       Group C       Group E       Group C
Criterion                          Mean     SD   Mean     SD   Mean     SD   Mean     SD

Original RT in seconds             4.53   2.10   1.26    .21   1.19    .21   1.23    .22
Reproduction RT in seconds         4.09   2.33   1.18    .16   1.18    .24   1.18    .21
No. correct reproductions          2.10   1.56  10.00    .00  10.00    .00  10.00    .00
No. unusual original responses*    5.20   1.91    .80    .90    .40    .73    .35    .61
No. usual original responses†      1.08    .88   5.70   1.76   7.00   1.88   6.92   1.71

* Response occurring no more than once among 80 Ss.
† Response occurring at least 10 times among 80 Ss.


















































TABLE 2

Mean Number of Words Recalled by Groups E and C under
the Initial Task Conditions

                  Disturbing Words             Neutral Words
                Task A       Task NA        Task A       Task NA
Group          Mean     SD   Mean     SD   Mean     SD   Mean     SD

Recall I
  E            6.45   1.69   6.25   1.44   6.90   1.20   7.55   1.24
  C            6.45   1.36   7.65    .91   6.80   1.21   7.55   1.69
  Both         6.45   1.54   6.95   1.39   6.85   1.19   7.55   1.48
Recall II
  E            7.20   1.40   6.25   1.58   7.10   1.51   7.25   1.41
  C            7.00   1.41   7.30   1.45   7.50   1.32   7.55   1.66
  Both         7.10   1.41   6.78   1.60   7.30   1.44   7.40   1.55





























type of problem. Three easy patterns were 
given during the 15-min. period to maintain the 
credibility of the situation. This latter task was 
comparable to Zeller’s repression task. 

After 15 min. the block-tapping task was discontinued
and E asked S to recall as many of
the words that he had previously learned as 
possible. Two minutes were allowed for the 
recall period. To hold cues to a minimum this 
was a verbal recall with no mention or use of 
the nonsense syllables or drum. If S started to 
give nonsense syllables, E reiterated his request 
for the recall of the words. 

This first recall period was followed by a 
second interpolated task. Here the neutral 
block-tapping task was administered to all Ss 
to ascertain if the effects of the anxiety created 
by the failure experience could be dissipated by 
a success experience. Differing instructions 
were required in view of the different sets created 
by the initial interpolated task. Those who had 
experienced failure received the following in- 
structions: “We are going to do some more 
block-tapping problems. I believe you can do 
better and feel that you should have another 
opportunity. Are you ready?” The Ss who 
had experienced the easy patterns did the same 
task as before. The instructions were as follows: 
“We are going to do some more block-tapping 
problems. Are you ready?” Again the score 
sheet showed S that he was getting almost all the 
items correct. 

These tapping problems were stopped after 
15 min. and Ss were again asked to recall the 
words learned previously. After the 2 min. al- 
lowed for this recall, Ss were paid for participat- 
ing in the experiment and dismissed. 


RESULTS AND DISCUSSION


Table 2 presents the basic data for 
the mean number of words recalled 
by Groups E and C on Recalls I and 
II. It will be remembered that half 
of the Ss were given an anxiety-pro- 
ducing failure task (Task A) and half 
were given a non-anxiety-producing 
task (Task NA) prior to Recall I. 
All Ss were given the non-anxiety- 
producing task prior to Recall II. 
The data in Table 2 are arranged rela- 
tive to the nature of the initial inter- 
polated tasks. 

Prior to performing an over-all 
analysis of variance, Bartlett’s test 
for homogeneity of variance was 
applied to the total words recalled by 
Ss in the four independent groups. 
The results indicated that the samples 
were not heterogeneous in variance. 
The uncorrected χ² was .855, which
for df = 3 yields a p of approximately
.84.
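The reported probability can be checked directly. The following is a minimal sketch, assuming only that the uncorrected statistic is referred to the chi-square distribution with 3 df; the function name is illustrative and does not come from the original report. For 3 df the upper-tail probability has a closed form, so no statistical library is needed.

```python
import math


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability of a chi-square variate with 3 df.

    Closed form valid only for 3 df:
    Q(3/2, x/2) = erfc(sqrt(x/2)) + 2 * sqrt(x / (2*pi)) * exp(-x/2)
    """
    z = x / 2.0
    return math.erfc(math.sqrt(z)) + 2.0 * math.sqrt(z / math.pi) * math.exp(-z)


# Uncorrected Bartlett chi-square reported in the text.
p = chi2_sf_df3(0.855)
print(f"p = {p:.2f}")
```

A probability this large gives no grounds for rejecting homogeneity of variance, which is the condition the subsequent analysis of variance relies on.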


The results of the over-all analysis 
of variance are given in Table 3. 
None of the values of F based on the 
noncorrelated data was significant; 
only the F for groups approached the
5% point. For this portion of the

























TABLE 3

Summary of Analysis of Variance

Source of Variation                          df   Mean Square      F

Between groups                                1     10.1531     3.2785*
Between initial tasks                         1      4.7531     1.5348
Groups X Initial tasks                        1      8.7782     2.8345
Between Ss of same group                     76      3.0969
Total between Ss                             79

Between words                                 1     16.6531     9.2972**
Between recalls                               1      3.0031     1.6766
Words X Recalls                               1       .1532
Groups X Words                                1      3.4050     1.9010
Groups X Recalls                              1       .0800
Initial tasks X Words                         1      1.9550     1.0914
Initial tasks X Recalls                       1     10.1550     5.6694*
Groups X Initial tasks X Words                1      8.7762     4.8996*
Groups X Initial tasks X Recalls              1       .1512
Groups X Words X Recalls                      1      2.2780     1.2718
Groups X Initial tasks X Words X Recalls      1       .0015
Initial tasks X Words X Recalls               1       .2512
Pooled Ss of same group X Columns           228      1.7912
Total within Ss                             240

Total                                       319

* Significant at the 5% level.
** Significant at the 1% level.


analysis the mean square based upon 
the variation between Ss in the same 
group was used to test the significance 
of the differences. 
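The choice of error term can be illustrated with the mean squares of Table 3. This is a sketch only: the variable names are hypothetical, and the between-Ss error mean square is taken as 3.0969 with 76 df, the value implied by the reported F ratios. Each F is the effect mean square divided by the error mean square for its portion of the analysis.

```python
# Error mean squares read from Table 3 (names are illustrative).
MS_ERROR_BETWEEN = 3.0969  # between Ss of same group, 76 df
MS_ERROR_WITHIN = 1.7912   # pooled Ss of same group x columns, 228 df

# Effect mean squares for a few rows of each portion of the analysis.
BETWEEN_EFFECTS = {
    "Groups": 10.1531,
    "Initial tasks": 4.7531,
    "Groups x Initial tasks": 8.7782,
}
WITHIN_EFFECTS = {
    "Words": 16.6531,
    "Initial tasks x Recalls": 10.1550,
    "Groups x Initial tasks x Words": 8.7762,
}


def f_ratios(effects, ms_error):
    """Divide each effect mean square by the given error mean square."""
    return {name: ms / ms_error for name, ms in effects.items()}


between_f = f_ratios(BETWEEN_EFFECTS, MS_ERROR_BETWEEN)
within_f = f_ratios(WITHIN_EFFECTS, MS_ERROR_WITHIN)

for name, f in {**between_f, **within_f}.items():
    print(f"{name:32s} F = {f:.4f}")
```

The noncorrelated (between-Ss) effects are tested against the variation between Ss in the same group, and the correlated (within-Ss) effects against the pooled residual interaction.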

Three significant F values were 
obtained from the analysis of the 
correlated data. Here the residual 
interaction (pooled R X C for Ss in 
the same groups) was used to test the 
significance of the findings. The F 
between words was significant beyond 
the 1% level; hence fewer D than N 
words were recalled. As may be seen 
in Table 2, this pattern occurs in six 
of the eight groupings of data. In 
individual cell comparisons only two 
differences between D and N words 
were significant at the 5% level; t’s
between D and N words recalled were 
2.77 for Group E under NA task condi- 
tions on Recall I and 2.17 for the same 
subgroup on Recall II. The interaction
between words and groups was
not significant, however.

A possible explanation for this
difference is that words selected as
disturbing words on the basis of the
usual criteria for word association
tests may not be indicators of emotional
disturbance but may merely
reflect that there is a less stereotyped,
less dominant response available to S.
An analysis of the original responses to
the words confirmed the observation
that more stereotyped associations
were given to the neutral than to the
disturbing words. Words used only
as D words produced an average of 
30.29 (SD = 9.20) associations ; words 
used as D words for some Ss and as N 
words for other Ss, a mean of 22.85 
(SD = 7.42) associations; words used 
only as N words, a mean of 20.85 
(SD = 6.55) associations. The differ- 
ence between the mean number of 
associations for the words used only as 
N or only as D words was significant 
at the 1% level (t = 4.94), as was 
the difference between words used 
only as N and those used sometimes as 
N and sometimes as D words for 
different Ss (t = 4.18). If this hypothesis
is correct, one would expect
that D words would also be more diffi- 
cult to learn than N words. Since 
the experiment was set up to equate 
level of learning, the data are limited 
in this area. The number of trials 
required to reach the first correct 
anticipation of each word was used as 
a crude measure of difficulty level. 
Group E required 13.76 (SD = 5.58) 
trials for D words and 13.28 (SD 
= 6.08) trials for N words. Group C 
required 12.12 (SD = 4.86) trials for 
D words and 12.24 (SD = 4.75) trials 
for N words. None of these differ- 
ences, either those between or those 
within groups, is significant. Simi- 


larly, there is no evidence to indicate 
that the learning task itself was more 
difficult for the group for whom half 
the words learned were words selected 













as reflecting emotional disturbance. 
The mean number of trials required to 
meet the criterion of learning was 
38.60 (SD = 10.31) for Group E and 
34.58 (SD = 11.53) for Group C; 
this difference, though in the expected 
direction, was not significant (t = 1.62).
It is possible that differences in difficulty
would have been obtained had
the criterion for learning been more
stringent.
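The t for trials to criterion can be approximated from the reported means and SDs. The sketch below rests on two assumptions that the report does not state: that Groups E and C contained 40 Ss each, and that the SDs were computed with N rather than N - 1 as divisor. The function name is illustrative.

```python
import math


def t_from_summary(mean1, sd1, mean2, sd2, n):
    """t for two independent groups of equal size n, from means and SDs.

    Assumes each SD used n as divisor, so each variance is rescaled by
    n / (n - 1) before the standard error of the difference is formed:
    SE^2 = (sd1^2 + sd2^2) / (n - 1).
    """
    se = math.sqrt((sd1 ** 2 + sd2 ** 2) / (n - 1))
    return (mean1 - mean2) / se


# Trials to criterion: Group E vs. Group C (n = 40 per group is assumed).
t = t_from_summary(38.60, 10.31, 34.58, 11.53, n=40)
print(f"t = {t:.2f}")
```

Under these assumptions the result agrees with the reported value to two decimals; with 78 df it falls short of the 5% point.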

The F value for the triple inter- 
action between groups and initial 
tasks and words was significant be- 
yond the 5% level. Hence the differ- 
ence that was found between D and N 
words was related both to the nature 
of the initial interpolated task and to 
the groups; the latter variable reflects, 
it will be recalled, the difference in Ss’ 
reactions to the D words. When the 
data from Recalls I and II are com- 
bined, both groups, but particularly 
Group E, recall fewer D than N words. 
This difference is much more evident 
under NA than under A conditions. 
There is least difference between D 
and N words for Group C under NA 
conditions (Mean for D words =7.48; 
Mean for N words = 7.55); however, 
there is most difference between D 
and N words for Group E under these 
conditions (Mean for D words = 6.25; 
Mean for N words = 7.40). Group 
E under NA conditions did, as ex- 
pected, recall significantly fewer D 
words than Group C (Recall I: 
t = 3.58, p = .01; Recall II: t = 2.13,
p = .05) and did not recall fewer N
words (Recall I: t = .26; Recall II:
t = .87). However, Group E under 
Task A conditions did not recall fewer 
D words than Group C, and did not 
recall fewer D words than Group E 
under NA conditions. The inter- 
actions between words and groups, 
between initial tasks and words, and 
between initial tasks and groups were 
all insignificant. 

The interaction between initial 
tasks and recalls was also significant 


beyond the 5% level. Hence the 
effect of the initial interpolated task 
was dependent upon the immediacy 
of the recall. The interaction effect 
was due to the inefficiency of recall 
immediately following Task A; the 
interpolation prior to Recall II of a 
non-anxiety-producing task abol- 
ished this effect. On Recall I the 
mean number of words recalled after 
Task A was 6.65; on Recall II, 7.20. 
The mean number of words recalled 
after initial Task NA was 7.25 on 
Recall I, 7.09 on Recall II. Per- 
formance improved on Recall II for 
initial Task A Ss; the increase was 
significant for both D and N words 
(D words: t = 3.61, p = .01; N
words: t = 2.04, p = .05). Performance
showed little change for
initial Task NA Ss for either D or N
words; t’s for D and N words on
Recalls I and II were .77 and .83, 
respectively. Hence there is no in- 
teraction between words and recalls. 
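The four means quoted for the interaction between initial tasks and recalls follow from the "Both" rows of Table 2 by averaging over word type. A brief sketch, with a hypothetical dictionary layout, makes the arithmetic explicit.

```python
# Cell means for Groups E and C combined ("Both" rows of Table 2).
# Keys are (recall, initial task, word type); the layout is illustrative.
both_means = {
    ("I", "A", "D"): 6.45, ("I", "NA", "D"): 6.95,
    ("I", "A", "N"): 6.85, ("I", "NA", "N"): 7.55,
    ("II", "A", "D"): 7.10, ("II", "NA", "D"): 6.78,
    ("II", "A", "N"): 7.30, ("II", "NA", "N"): 7.40,
}


def task_by_recall_mean(recall, task):
    """Average the D-word and N-word means for one recall and initial task."""
    return (both_means[(recall, task, "D")] + both_means[(recall, task, "N")]) / 2


for recall in ("I", "II"):
    for task in ("A", "NA"):
        print(recall, task, f"{task_by_recall_mean(recall, task):.2f}")
```

The Task A mean rises from Recall I to Recall II while the Task NA mean is essentially unchanged, which is the pattern the interaction reflects.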


On the basis of these findings, one 
can draw three conclusions: 


a. Words selected as disturbing for 
Group E Ss were recalled less effectively 
by Groups E and C even though there 
was no clear evidence that the D words 
were more difficult to learn than the 
neutrally toned words. This raises a 
question about the validity of the usual 
clinical indicators of disturbance on 
word association tests. However, be- 
cause of the design of this experiment, 
many of the most disturbing words were 
eliminated. It is possible that when 
using less comparable groups, the word 
association test as used here would prove 
useful as an indicator of pre-experi- 
mental partial repression. But, in gen- 
eral, the findings do not support the 
generality given the repression concept 
by Freudian theorists, at least as meas- 
ured by word association tests. 

b. The effect of the D and N words
was significantly related to the nature 
of the initial interpolated task and of 
the groups. Fewer D words were re- 
called by Group E than by Group C 
under Task NA conditions but not under 







Task A conditions. There was no evi- 
dence of an additive decrement in recall 
efficiency when the effects of D words and 
Task A conditions were combined. It 
is possible that this lack of additive effect 
is related to the mildness of the anxiety 
related both to D words and Task A. 
However, on the basis of clinical im- 
pressions, it is quite unlikely that there 
is a straight line relationship between 
efficiency of recall and anxiety level. 
Since the normal organism has learned 
various behaviors for dealing with anx- 
iety-producing situations, it is more 
likely that there is a step-wise relation- 
ship between degree of anxiety and recall. 

c. The efficiency of recall was signi- 
ficantly related to the nature of the 
initial interpolated task. Recall I was 
less good after initial Task A than after 
initial Task NA. This difference dis- 
appeared after the interpolation of the 
second non-anxiety-producing task prior 
to Recall II. The ease with which the 
task-anxiety effects were reduced raises 
serious question as to the appropriateness 
of considering such experimentally in- 
duced effects as analogous to clinical 
repression. It is concluded that the 
appropriate design for studying clinical 
repression, if such a concept has meaning, 
has not yet been devised. 


SUMMARY 


The present study was made to determine the 
effect of two variables, words with association 
disturbances as measured by the word associa- 
tion method and an anxiety-producing inter- 
polated task, on recall efficiency. 

Eighty college males were given a 168-word 
association test. Ten words with association 
disturbances and ten words without association 
disturbances were selected for each S of Group E; 
all of these 20 words had to show no association 
disturbance for an S of Group C. These words 
were paired with nonsense syllables and learned 
to one correct anticipation trial. Following the 
learning half of the Ss in each group received an 
anxiety-producing task, half a non-anxiety- 
producing task, before Recall I. Then all Ss 
received a non-anxiety-producing task which was 
followed by Recall II. 

The findings were: (a) Words selected as dis- 
turbing for Ss of Group E were recalled less 
effectively by both Groups E and C; (b) the
effect of the disturbing and neutral words was 
significantly related to the nature of the initial 
interpolated task and of the groups; and (c) the







efficiency of recall was significantly related to 
the nature of the initial interpolated task. The 
data did not support any additive effect of 
disturbing words and the anxiety-producing task 
on retention in this experimental setting. The 
results question Freudian repression theory as an 
explanatory concept for association disturbances 
as measured by the word association method. 


REFERENCES 


1. Farber, I. E. Response fixation under
anxiety and non-anxiety conditions. J.
exp. Psychol., 1948, 38, 111-131.

2. Freud, S. Collected papers. Vol. 4. London:
Hogarth, 1925.

3. Freud, S. New introductory lectures in
psychoanalysis. New York: Norton, 1933.

4. Gilbert, G. M. The new status of experimental
psychology on the relationship of
feeling to memory. Psychol. Bull., 1938,
35, 26-35.

5. Hull, C. L. The meaningfulness of 320
selected nonsense syllables. Amer. J.
Psychol., 1933, 45, 730-734.

6. Hull, C. L. Principles of behavior. New
York: D. Appleton-Century, 1943.

7. Jung, C. G. Studies in word association.
New York: Moffat, Yard, 1919.

8. Laffal, J. The learning and retention of
words with association disturbances. J.
abnorm. soc. Psychol., 1952, 47, 454-462.

9. Montague, E. K. The role of anxiety in
serial rote learning. J. exp. Psychol.,
1953, 45, 91-95.

10. Rapaport, D. Emotions and memory.
Baltimore: Williams & Wilkins, 1942.

11. Rapaport, D., Gill, M., & Schafer, R.
Diagnostic psychological testing. Vol. 2.
Chicago: Yearbook Publishers, 1946.

12. Rosanoff, A. J. Manual of psychiatry.
New York: Wiley, 1938.

13. Sears, R. R. Survey of objective studies of
psychoanalytic concepts. Soc. Sci. Res.
Coun. Bull., 1943, No. 51.

14. Spence, K. W. Theoretical interpretations
of learning. In F. A. Moss (Ed.),
Comparative psychology. (Rev. Ed.) New
York: Prentice-Hall, 1942.

15. Spence, K. W. Theoretical interpretations
of learning. In S. S. Stevens (Ed.),
Handbook of experimental psychology.
New York: Wiley, 1951. Pp. 690-729.

16. Zeller, A. F. An experimental analogue
of repression. I. Historical summary.
Psychol. Bull., 1950, 47, 39-51.

17. Zeller, A. F. An experimental analogue of
repression. II. The effect of individual
failure and success on memory measured
by relearning. J. exp. Psychol., 1950, 40,
411-422.


(Received August 21, 1953) 
















REINFORCEMENT SCHEDULES IN HABIT REVERSAL— 
A CONFIRMATION 


JOSEPH H. GROSSLIGHT, JOHN F. HALL, AND WINFIELD SCOTT 
The Pennsylvania State University


It is an established principle that a 
regimen of partial reinforcement dur- 
ing a learning series leads to greater 
resistance to extinction than does
continuous reinforcement. Recently, 
Wike (2) has extended the operation 
of this principle into the area of re- 
learning or habit reversal. Specifi- 
cally, he found that “when a response 
is acquired under a partial reinforce- 
ment regimen, it will resist extinction 
longer than a response conditioned by 
continuous reinforcement whether ex- 
tinction is assessed in the customary 
fashion or by a retraining test” (2, 
p. 260). The present study, per- 
formed without knowledge of the 
Wike experiment, essentially dupli- 
cates his conditions and thus provides 
confirmation of his findings. 

In addition to the empirical con- 
firmation of the Wike study, attention 
should be called to the possibility that 
the retraining or habit-reversal situa- 
tion may be a more sensitive and 
appropriate procedure for comparing 
response strengths acquired under 
partial or continuous reinforcement 
schedules than conventional extinc- 
tion procedures. The effects of cer- 
tain variables such as frustration 
drives or temporary inhibition factors, 
which play an important but con- 
founding role in the traditional ex- 
tinction procedure, are mitigated in 
the habit-reversal situation. 


METHOD


Subjects —Twenty-eight Sprague-Dawley al- 
bino rats, weighing 175 to 200 gm. at the be- 
ginning of the experiment were used. Approxi- 
mately one week before the experiment, Ss were 
placed on a 23-hr. deprivation, 1-hr. feeding


schedule, and they remained on this schedule 
throughout the experiment. 

Apparatus—A Y-alley discrimination ap- 
paratus painted flat gray was used. By moving 
the Y alley, and thereby employing two of the 
three goal boxes, it was possible to randomize the 
position of the stimulus card. The correct card 
was allowed to swing freely, thus permitting S 
to gain entrance into the goal box and obtain 
the food reward, while the card denoting the in- 
correct goal box could be made fast. 

Preliminary training.—Four days of prelimi- 
nary training were given. The first two days 
consisted of two trials each with no cards on the 
goal boxes. The first trial for each day was a 
free choice, while the second was a forced choice 
to the side opposite that of the previous trial. 
Each trial was rewarded with a small pellet of 
food. The last two days of training were similar 
in procedure, except that gray cards were now 
placed in the entrances of the goal boxes and four 
trials per day were given. 

On the basis of the mean running time of the 
last two trials on the last day of preliminary 
training, two equated groups were established 
with mean running times of 5.84 and 5.94 sec. 
and SD’s of 4.33 and 4.61, respectively. 

Discrimination training—With the white 
card as the positive stimulus, Ss from both 
groups were given six trials per day for 14 days. 
On these six trials, the partially reinforced Ss 
received four reinforcements and two nonreinforcements.
These nonreinforcements were distributed
differently in each day’s training, with
the requirement that no day’s training begin or
end with nonreinforcement. On reinforced
trials, S received a small pellet of food; on non- 
reinforced trials, S was confined to the goal box 
for 30 sec. without food. 
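The constraint on the partial reinforcement schedule leaves only a small set of admissible daily orders, which can be enumerated. The sketch below is illustrative: the function name and the R/N coding are assumptions, and the report does not say which order was used on which day.

```python
from itertools import combinations

TRIALS_PER_DAY = 6
NONREINFORCED_PER_DAY = 2


def daily_patterns():
    """List every order of 4 reinforced (R) and 2 nonreinforced (N) trials
    in which the day neither begins nor ends with a nonreinforced trial."""
    interior = range(1, TRIALS_PER_DAY - 1)  # 0-based positions 1 through 4
    patterns = []
    for non_r in combinations(interior, NONREINFORCED_PER_DAY):
        patterns.append(
            "".join("N" if i in non_r else "R" for i in range(TRIALS_PER_DAY))
        )
    return patterns


for pattern in daily_patterns():
    print(pattern)
```

Only six orders satisfy the requirement, so over 14 days of training some orders must have recurred.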

In this phase of the training, the black card 
was made fast so that S always terminated a 
trial by entering the white (or positive) goal box, 
in which reinforcement or nonreinforcement was 
received. In all phases, the cards were alter- 
nated with respect to position in a prearranged 
fashion for each day’s trials, with only the 
restriction that the black and white cards appear 
an equal number of times in the right or left 
positions, respectively. 

Since a correction procedure was employed, 
an error was judged to have occurred when S’s 
hind feet had both crossed a line 7 in. from the 













Fig. 1. Mean number of black responses in the habit reversal phase (abscissa: blocks of six trials)


black goal-box entrance. The basic data of this 
portion of the experiment are the number of 
errorless choices of the white stimulus card for 
each group. 

Habit reversal training.—In this phase, the 
black card became the positive stimulus card. 
Both doors were unlocked so that S could enter 
either goal box. However, because reward could 
be obtained only in the black goal box, no pat- 
tern of reinforcement was maintained. Each S 
was given six trials a day for ten days. The 
basic data of this portion of the experiment are 
the number of choices of the black (previously 
negative) goal box. 


RESULTS AND DISCUSSION


Although each group began with 14 
Ss, a number of Ss had to be eliminated 
in the discrimination-training phase be- 
cause of failure to perform. In order 
that the groups might be equated for 
the habit-reversal phase, additional Ss 
were eliminated. This procedure even- 
tuated in an N of 10 for the continuous 
reinforcement group and 12 for the 
partial reinforcement group. The mean 
number of correct responses for the six 
trials on the last day of the original dis- 
crimination training was 5.10 for the 
continuous reinforcement group, and 


5.42 for the partial reinforcement group. 
The mean number of correct reversal 
responses (running into the black goal 







box) per day for the ten-day period for 
the continuously reinforced group was 
3.47; in contrast, the mean number of 
reversal responses for the partially 
reinforced group was 2.32. Analysis 
of covariance indicates that this dif- 
ference between the continuous and 
partial reinforcement groups in the 
reversal training is significant at the 
.01 level of confidence (F=16.24 for 1 
and 19 df). The greater difficulty ex- 
perienced by the partially reinforced 
group in effecting the reversal is even 
more clearly demonstrated in a day-by- 
day analysis (Fig. 1). 

The results clearly confirm Wike’s con- 
clusions that a partial reinforcement pro- 
cedure tends to increase the persistence 
of the original response when an alterna- 
tive response is required for reinforce- 
ment, thereby delaying reversal. 


SUMMARY 


The purpose of this experiment was to investi- 
gate partial versus continuous reinforcement in 
a habit-reversal situation. Two groups of rats, 
one given partial reinforcement and the other 
continuous reinforcement, were trained to make 
a simple white-black discrimination. In this 
phase, the white choice was positive and the 
black negative, with a correction procedure em- 
ployed. After 14 days of six trials per day, 
reversal training was given with the black now 
positive. The results indicated that the con- 
tinuous reinforcement group effected the reversal 
significantly more rapidly than the partial 
reinforcement group. 


REFERENCES 


1. Jenkins, W. O., & Stanley, J. C. Partial
reinforcement: a review and critique.
Psychol. Bull., 1950, 47, 193-234.

2. Wike, E. L. Extinction of a partially and
continuously reinforced response with
and without a rewarded alternative. J.
exp. Psychol., 1953, 46, 255-260.


(Received November 27, 1953) 






















A TEST OF WHETHER THE “NONREWARDED” ANIMALS 
LEARNED AS MUCH AS THE “REWARDED” ANIMALS 
IN THE CALIFORNIA LATENT LEARNING STUDY¹
JOSEPH H. KANNER²

New York University 


Tolman, in his analysis of the Cali- 
fornia latent learning studies (1, 18), 
concluded that: 

...in these experiments, the final
amount of learning of a control group which 
had had strong reward throughout was no 
greater than that of the experimental group 
which received practically no reward 
throughout most of the learning period (16, 
p. 343, italics mine). 

Kendler (8) has pointed out that this 
conclusion seemed somewhat pre- 
mature if one accepted one of the 
major conclusions of the latent learn- 
ing studies, namely, that maze per- 
formance does not always mirror maze 
learning. Implied in this principle of 
a division between learning and per- 
formance is the possibility that equal 
performance need not imply equal 
learning, i.e., under certain conditions, 
different amounts or types of learning 
can produce the same level of per- 
formance. Perhaps a more sensitive 
measure of the amount of performance 
in the California latent learning situa- 
tion would reveal differences in the 
performances of the experimental and 
control groups. 

An obvious experimental design 
which would provide a more sensitive 
measure would be one involving some 
activity interpolated between latent 
learning trials and test trials. The 

¹ This paper is a portion of a dissertation submitted
to the Department of Psychology of
New York University in partial fulfillment of
the requirements for the degree of Doctor of
Philosophy. The writer is indebted to Professor
Howard H. Kendler for his advice and assistance
throughout the investigation.

² Now at the Human Resources Research
Office, Washington, D. C.


present paper reports the results of 
such a study. In order to execute 
this study, however, it was necessary 
to reproduce latent learning phenom- 
ena. The first part of this paper 
will report the results of such pre- 
liminary experiments, while the sec- 
ond part will report the results of the 
main study. 


PRELIMINARY EXPERIMENTS 
Experiment I 


Subjects —The Ss were 20 naive male albino 
rats (approximately 60 days old) of the Wistar 
strain, purchased, as were all other Ss in the 
studies presented in this paper, from the same 
stock, at the Albino Farms, Red Bank, New 
Jersey. 

Apparatus.—The ground plan of the 14-unit 
maze is shown in Fig. 1. This maze was 
patterned after the description contained in the 
Tolman and Honzik articles (17, 18). In at- 
tempting to duplicate the maze and the essential 
features of the experimental situation employed 
by Tolman and Honzik, modifications were 
introduced, some through lack of certain infor- 
mation, others in accordance with practices used 
in the Animal Behavior Laboratory at New York 
University. 

The swinging doors or “valves” (13) originally 
employed by Tolman and Honzik were replaced 
by guillotine doors to prevent retracing in the 
present maze. These doors consisted of a sliding 
metal plate suspended between wooden uprights, 
and were manipulated by means of a string and 
pulley arrangement. Since the wooden uprights 
for these guillotine doors might serve as a visual 
cue for the correct pathways, nonfunctional 
“dummy” doors, which were raised, were in- 
stalled in each of the blinds, duplicating the other 
guillotine doors in dimensions and appearance. 
The entire maze was painted a flat black. 

The maze was placed on the floor of a room 
measuring 25 ft. long and 18 ft. wide. About 
1 ft. in front of the maze, and running the width 
of the maze, was a plywood screen, 66 in. high 
and painted flat black. The E stood behind 

















































Fig. 1. Ground plan of 14-unit maze


this screen and manipulated the guillotine doors. 
Since the rear portion of the maze lay approxi- 
mately 14 ft. from E and the plywood screen, it 
was difficult to observe the performance of Ss 
when they approached this portion of the maze. 
A mirror permitted adequate observation of Ss’ 
performance in the outermost portions of the 
maze. A 150-w. unfrosted bulb was suspended 
7 ft. above the center of the maze. The experi- 
mental room was completely separated from the 
living cage area by a large plywood screen. 

Two end boxes, measuring 15 X 6 in., were 
used. One of these boxes was always used 
during the food-rewarded trials, while the other 
was used during the nonfood trials and never 
contained food. These boxes were of the same 
material and construction as the maze proper, 
and were opened by means of a hinged top 
covered by .5-in. mesh hardware cloth. Food
was placed in a shallow glass container located 
at the center of the back wall of the end box. 

Preliminary training—For a period of two 
weeks prior to running in the maze, Ss were 
handled from 2 to 3 min. daily. The hunger 
rhythm pattern was also initiated during this 
period, with Ss being given one pellet of Purina 
Dog Chow, weighing approximately 7 gm., at 
the time of the day at which they were to be fed 
during the experiment. The straightaway train- 
ing employed by Tolman and Honzik was elimi- 
nated in order to reduce any possible secondary 
reinforcement generalizing to the end boxes of 
the 14-unit maze.

Training series—The training consisted, for 
all Ss, of one daily trial for a period of 21 days. 







The Ss were divided into two groups of ten each, 
with Ss of both groups being motivated by a 
hunger drive based upon 22 hr. of food depriva- 
tion. The Ss in Group F found food in the end 
box at the end of each trial. This food consisted 
of a mash prepared by soaking Purina Dog Chow 
pellets in water. When an S had fed in the end 
box for 2 min., it was removed from the end 
box and placed in the living cage. Here it was 
given its daily ration of one Purina Dog Chow 
pellet. 

The Ss of Group NF(10) did not find food in 
the end box for the first ten trials. When an S 
of this group reached the empty end box, it was 
kept there for 2 min. At the end of this time 
interval, it was removed to a carrying cage out- 
side the experimental and living cage rooms and 
kept there for a period of 2 hr. It was then 
returned to the living cage where it was fed 1 hr. 
later. Prior to the beginning of Trial 11, and 
on all succeeding trials, food was placed in the 
end box. During and after Trial 11, Ss were 
treated as were Ss of Group F. 

Scoring procedure.—The scoring procedure
used throughout the present investigation was 
that described by Tolman and Honzik (18). In 
this procedure, an error was scored when S made 
an entrance into a blind the full body length, not 
including the tail. No more than one error was 
counted for entering any given blind. The scor- 
ing procedure suggested by Reynolds (15), which 
provided for the recording of a partial entrance 
into the blinds, was attempted but was found to 
be too difficult for one E. 
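The scoring rule described above can be restated as a short sketch. The record format used here (pairs of a blind label and a flag for whether the full body length, tail excluded, entered the blind) is an assumption made only for illustration; in the experiment the errors were recorded by direct observation.

```python
def count_errors(entries):
    """Count errors on one trial under the full-body-length rule.

    `entries` is a hypothetical record: (blind_id, full_body_entered)
    pairs in the order in which they were observed.
    """
    blinds_entered = set()
    for blind_id, full_body_entered in entries:
        # Partial entrances are not scored, and a blind entered more
        # than once still contributes only one error.
        if full_body_entered:
            blinds_entered.add(blind_id)
    return len(blinds_entered)


# Illustrative trial: blind 3 entered twice, blind 5 only partially, blind 9 once.
trial = [(3, True), (3, True), (5, False), (9, True)]
errors = count_errors(trial)
```

Under this rule the maximum score on a trial equals the number of blinds in the maze.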


Results.—The mean number of errors
per trial is graphically presented
in Fig. 2. Evidence of latent learning
would be a significant difference in
performance on Trial 11 and no significant
difference on Trial 12. Examination
of this figure reveals little
difference between the performance of
the two groups during the “latent
learning” period or the series of trials
that followed the introduction of food 
for Group NF(10). None of the 
differences between the two groups, on 
any given trial, approached statistical 
significance. 

The results of Exp. I indicated that 
the hope of duplicating the Tolman- 
Honzik latent learning phenomenon 
depended on the ability to “slow up” 
the performance of Group NF(10). 
It was therefore decided to modify 











[Figure 2: mean errors (vertical axis) plotted against trials (horizontal axis) for Groups F and NF(10).]

Fig. 2. Mean number of errors per trial made by Groups F and NF(10)
for the 21-trial series in Exp. I


that part of our experimental pro- 
cedure which might be related to the 
rapid learning of this group. 


Experiment II 


Experimental design.—The experi- 
mental procedure of the Tolman- 
Honzik study (18) was re-examined 
and compared with that of Exp. I in 
an attempt to isolate any important 
difference which might account for the 
faster performance of the no-food 
group during the latent learning 
period. 

One possibility was that the large 
mirror used by E in Exp. I in some
manner facilitated the learning of the 
maze pattern by Group NF(10). The 
mirror was removed and a pilot group 
of five Ss was given ten trials in the 
maze without food reward. Since 
their performance was similar to the 
original Group NF(10), this possi- 
bility was eliminated from considera- 
tion. 

Two other possibilities were con- 
sidered. The first of these was sug- 
gested by Karn and Porter’s (7) 
evidence that removal from the end 


box was a source of reward. No 
information was available as to the 
exact time interval used by Tolman 
and Honzik. It was decided to 
lengthen the detention period of 
Group NF(10) in the end box from 
2 to 3 min. on the assumption that 
delaying the rewarding effect of re- 
moval from the maze might slow 
down Ss’ performance. In order to 
equate for the time spent in the end 
boxes, it was decided to lengthen the 
feeding time from 2 to 3 min. for 
Group F. 

The second possibility concerned 
the type of door used. Tolman and 
Honzik used swinging doors as com- 
pared with the guillotine doors used 
in Exp. I. It was decided, therefore, 
to revert to the original procedure. 

The important modifications in- 
troduced in Exp. II were the lengthen- 
ing of the detention and feeding per- 
iods in the end box to 3 min. and the 
substitution of swinging doors for 
guillotine doors. 


Subjects.—The Ss were 20 naive male albino
rats approximately 60 days of age. 







Apparatus.—The maze used in Exp. I was 
modified by the aforementioned substitution of 
swinging doors for the guillotine-type doors. 
These doors were made of sheet metal, painted 
flat black, suspended between the maze walls, 
with a }-in. gap between the bottom of the door 
and the floor of the maze. These swinging 
doors were placed in the “correct” pathways 
only, as in the original Tolman-Honzik maze. 
Similarly, as in the Tolman-Honzik maze, cur- 
tains were placed in both the “correct” pathways 
and in the blinds to prevent S from using the 
presence of a swinging door as a cue for the 
“correct” pathway. In passing one of these 
doors, S had to push it forward with its nose, 
and when S ran forward, the door would lift and 
brush over the length of its body. The gap 
between the door and the maze floor described 
above permitted the tail to pass without injury 
due to jamming. The wooden uprights previ- 
ously used with the guillotine doors served as 
frames for the swinging doors and permitted the 
door to open only in the forward direction, thus 
preventing retracing by S. 

Preliminary training.—The handling and 
feeding procedures were the same as those used 
for Exp. I. Although signs of emotionality were
evident (defecation, urination), it was found that 
Ss were capable of manipulating the swinging 
doors without any previous training. Therefore, 
training in a straightaway, as was used in the 
Tolman-Honzik study (18), was not introduced. 

Training series.—As in Exp. I, Ss were divided 
into two groups (Groups NF[10] and F) and 
received essentially the same training except 












[Figure 3: mean errors (vertical axis) plotted against trials (horizontal axis) for Groups F and NF(10).]

Fig. 3. Mean number of errors per trial made
by Groups F and NF(10) during the 15-trial
series in Exp. II







that Group F was permitted to feed in the end 
box for 3 min. and Group NF(10) was detained 
for an equivalent period. 


Results.—The mean error scores for
the two groups are graphically pre- 
sented in Fig. 3. As can be seen, the 
modification of experimental proce- 
dure retarded the performance of 
Group NF(10) during the latent 
learning period. However, the mean 
error score for this group dropped 
from 4.2 errors on Trial 10 to 1.9 
errors on Trial 11, which was the trial 
on which food was initially introduced. 
Since the mean error scores of the two 
groups were so similar (.9 and 1.9) on 
Trial 11, it was impossible for any 
latent learning to be demonstrated. 


Experiment III


Experimental design.—Since the 
major aim of the preliminary experi- 
ment was to arrive at an experimental 
technique capable of producing latent 
learning, it was decided that the most 
practical procedure would be to intro- 
duce the food into the maze earlier 
than Trial 11. The fourth trial was 
selected for the introduction of food, 
thus minimizing the possibility that 
the performance curve of Group 
NF(10) might “catch up” to that of 
Group F prior to the introduction of 
food reward. In addition to this 
modification in procedure, it was de- 
cided to increase the detention and 
feeding periods to 4 min. The results 
of Exp. II indicated that lengthening 
the detention and feeding periods had 
a tendency to increase the differences 
between the performance of the two 
groups during most of the latent 
learning period. 

Subjects.—The Ss were 20 naive male albino
rats, approximately 60 days old. 

Apparatus.—The 14-unit maze, with swinging 
doors, as described in Exp. II, was used. 

Preliminary training.—The handling and 


feeding procedures were identical with those in 
Exp. I and II. 













Training series.—The Ss were divided into
two groups of ten each. The Ss of each group 
received one trial daily for 11 days. The Ss of 
Group NF(3) were not given food until Trial 4. 
During this “latent learning” period, these Ss 
were detained in the empty end box for a period 
of 4 min. Following the fourth trial, they re- 
ceived the same treatment as Ss of Group F. 
The Ss in Group F were permitted, on all trials, 
to feed in the end box for 4 min. 


Results.—Figure 4 presents the
mean error scores of Groups F and 
NF(3). As can be observed, the 
experimental procedure used provides 
very definite evidence of the latent 
learning phenomenon. The error 
score of Group NF(3) dropped to 1.8
errors on Trial 5 from 6.9 errors on 
Trial 4. The equivalent data for 
Group F were 2.4 and 1.9 errors, 
respectively. The difference between 
the mean errors of Groups F and 
NF(3) was significant beyond the 1%
level on Trial 4 but not significant on 
Trial 5. 

A comparison of Fig. 3 and 4 
suggests that the extra minute of 
detention for Group NF(3) slowed up 
their performance somewhat during 
the latent learning period. The extra 
minute of opportunity for feeding, 
however, did not seem to have any 
effect on the performance of Group F. 
This can probably be explained by the 
fact that many Ss did not eat during 
the last minute of the 4-min. period. 


Main Experiments


Experimental design.—The two
main experiments involve the design 
in which a four-day latent learning 
period was followed by an interpo- 
lated period and then by the test 
trials. In Exp. IV, the interpolated 
activity consisted of rewarding blind 
alley entrances of the experimental 
maze, and in Exp. V it consisted of a 
30-day detention period during which 
time Ss were kept in their home cages. 

Each experiment involved three 


[Figure 4: mean errors (vertical axis) plotted against trials (horizontal axis) for Groups F and NF(3).]

Fig. 4. Mean number of errors per trial made
by Groups F and NF(3) during the 11-trial
series in Exp. III


groups: Group 0 never received any 
food in the end box during the four- 
day training (latent learning) period; 
Group 1 received food on the last day
of the training series; Group 4 was 
rewarded with food at the end of each 
daily trial. 


Subjects.—The Ss for this experiment were
60 naive male albino rats, approximately 60 
days old. 

Apparatus.—The 14-unit maze, described in 
Exp. III, was utilized. 

Preliminary training.—The preliminary train- 
ing was the same as that employed in Exp. I. 

Training series.—Each S in all groups received
one daily trial in the maze for four days.
Group 0 was detained in an empty end box for 
4 min. During the initial three training trials, 
Ss of Group 1 did not receive food; on Trial 4 
they did. Food was present in the end box for 
Group 4, and Ss in this group were permitted 
to eat for 4 min. 

When the 4-min. eating period for an S
terminated, it was removed to its living cage 
where it received its daily ration. In order to 
avoid any indirect food reinforcement, Ss who 
were not given food in the end box were removed 
at the end of the detention period to a carrying 
cage outside both the experimental and living 
cage rooms, and kept there for 2 hr. Following 
this, they were removed to the living cages where 
they were fed 1 hr. later. 














[Figure 5: mean errors (vertical axis) plotted against training trials 1-4 and test trials 1-8 (horizontal axis) for Groups 0M, 1M, and 4M.]

Fig. 5. Mean number of errors per trial made
by groups which had “maze feeding” as the interpolated
activity in Exp. IV


Interpolated activity.—The interpolated activity
for all groups began on the day following
the fourth training trial. The first experiment 
involved “negative training” in the experimental 
maze for two successive days. Each T section 
of the maze was blocked off so that S could not 
run from one unit of the maze to another as it 
had done during training. The correct path of 
each unit was blocked off so that Ss were forced 
to go into the blind alley. Initially, Ss were 
placed at the starting point of the maze, and 
when S entered the blind alley, it was permitted 
to eat for 30 sec. It was then removed from 
the blind and placed at the entrance of the 
second T section where it again fed in the blind. 
This procedure was repeated until S fed in all 
14 blinds. On the next day, the same procedure 
was repeated except that Ss were fed in the blinds 
in backward order. On both days it was noted 
that Ss’ actual feeding time dropped to 15 or 20 
sec. in the last three or four blinds. 

The three groups in this experiment possess 
both an identifying letter M, which indicates the 
interpolated activity used (maze feeding), and 
a numeral indicating the number of food-rein- 
forced trials during the training series. The 
groups are 0M, 1M, and 4M. During this interpolated
training, Ss’ daily rations were reduced
to one-half of a pellet of Purina Dog Chow, in 
order to insure their eating in the blinds. 

The second experiment involved the use of an 
interpolated detention sequence in which Ss
were kept in their home cages for a period of 30 
days. During this detention period, Ss received 










their usual regimen, and were never removed 
from their cages or handled in any manner. 
The groups in this experiment bear the identify- 
ing letter T (time), and numbers which refer to 
the number of food-reinforced trials during the 
training series. The groups are 0T, 1T, and 4T.

Test series.—The test series began on the day 
following the last day of interpolated activity. 
The Ss of each group were given one trial a day 
for eight days in the experimental maze. Food 
was present in the end box of the maze during 
these eight trials. When S entered the end box, 
it was allowed to feed for 4 min. It was then 
removed to the living cage where it received its 
daily food ration of one Purina Dog Chow pellet. 
This procedure was identical to that used with 
the food-rewarded groups prior to the inter- 
polated activities. 


Results.—Figures 5 and 6 graphically
represent the mean error scores
for the three groups in Exp. IV and 
V, respectively. The performance of 
the groups that received food reward 
throughout the training series (Groups 
4M and 4T) was clearly superior to 
the other groups during the training 
series. 

The performance of the different 
groups during the test series is 
evaluated in two ways. First, a 
comparison of mean errors on the 


[Figure 6: mean errors (vertical axis) plotted against training trials 1-4 and test trials 1-8 (horizontal axis) for Groups 0T, 1T, and 4T.]

Fig. 6. Mean number of errors per trial made
by groups which had 30-days detention as the
interpolated activity in Exp. V



















TABLE 1

Analysis of Variance of Mean Error
Scores on Second Test Trial

Source                      df    Mean square      F
Days reward                  2       19.35        5.74*
Interpolated activities      1        0
DR X IA                      2        2.45
Within groups               54        3.37
Total                       59

* Significant at the .01 level.


second test trial is made. This trial 
is selected since it follows the trial on 
which all groups received food reward 
in the end box for the first time; i.e., 
Groups 0M and 0T have their initial
experience with food in the end box on 
the first test trial. 

Table 1 presents the analysis of
variance of the error scores on the 
second test trial for Exp. IV and V. 
The results indicate that the number 
of days of food reward during the 
training trials was a significant vari- 
able (beyond the 1% level) deter- 
mining the performance of Ss on the 
second test trial. On the other hand, 
the type of interpolated activity did 
not appear to have any effect, nor was 
there any evidence of an interaction 
effect. 
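The F ratio for days of reward can be checked from the tabled values. The sketch below assumes that the second numeric column of Table 1 is a mean square, that the within-groups term serves as the error term, and that each of the six groups contained 10 Ss (consistent with 60 Ss in all); the variable names are illustrative only.

```python
# Values transcribed from Table 1 (second test trial).
ms_days_reward = 19.35  # mean square for days of reward
ms_within = 3.37        # mean square within groups (assumed error term)

f_days_reward = ms_days_reward / ms_within

# Degrees of freedom for a 3 x 2 design, assuming 10 Ss per group.
levels_reward, levels_activity, n_per_group = 3, 2, 10
df_reward = levels_reward - 1
df_activity = levels_activity - 1
df_interaction = df_reward * df_activity
df_within = levels_reward * levels_activity * (n_per_group - 1)
df_total = levels_reward * levels_activity * n_per_group - 1

print(round(f_days_reward, 2), df_reward, df_activity,
      df_interaction, df_within, df_total)
```

The ratio and the degrees of freedom obtained in this way agree with the entries in Table 1.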

The results of the analysis of vari- 
ance therefore justify individual com- 
parisons among the various groups in 
Exp. IV and V. The results of the 
same comparisons within each experi- 


ment were combined by the chi-square 
technique of combining probabilities 
(9). These comparisons, presented in 
Table 2, indicate that the greater the 
number of food reinforcements during 
the training series, the better able Ss 
were to resist the effects of the inter- 
polated activities. 
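The combination of probabilities can be illustrated with a minimal sketch, on the assumption that the chi-square technique referred to is Fisher's method, in which the sum of -2 ln p over k independent tests is referred to chi-square with 2k degrees of freedom. The function names are hypothetical. Because the p values printed in Table 2 are rounded, the chi-square values computed from them will not agree exactly with the tabled ones.

```python
import math


def fisher_combined_chi_square(p_values):
    """Return (chi_square, df) for Fisher's method: -2 * sum(ln p), df = 2k."""
    chi_square = -2.0 * sum(math.log(p) for p in p_values)
    return chi_square, 2 * len(p_values)


def chi_square_upper_tail_even_df(x, df):
    """Upper-tail probability of chi-square; closed form valid for even df only."""
    if df % 2 != 0:
        raise ValueError("closed form holds only for even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Rounded p values for the Groups 4 & 1 comparison (Exp. IV, Exp. V) in Table 2.
chi_square, df = fisher_combined_chi_square([0.90, 0.08])
combined_p = chi_square_upper_tail_even_df(chi_square, df)
```

For this comparison the sketch gives a combined probability of about .26 on 4 df, in agreement with the tabled value, although the chi-square itself falls somewhat below the tabled 5.5.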

In addition to comparing the per- 
formance of the various groups on the 
second test trial, an analysis of the 
error scores during the entire test 
series was made. As can be seen in 
Fig. 5 and 6, there appears to be no 
noticeable difference among the vari- 
ous groups during the test series in 
their ability to reach the criterion of 
running the maze without error. An 
analysis of the results in terms of 
mean number of trials to reach the 
criterion of zero errors failed to pro- 
duce a significant result. If, however, 
one compares the mean number of 
errors, not including errors on the first 
test trial, committed by the various 
groups before reaching the criterion of 
zero errors, a significant difference is 
obtained. Table 3 presents the re- 
sults of the analysis, while Table 4, 
which is similar in form to Table 2,
presents the results of the individual 
comparisons. Again, the results in- 
dicate that (a) the number of food- 
reinforced trials is a pertinent variable 
in Ss’ performance during the test 
series, and (b) the greater the number
of food reinforcements, the better able 














TABLE 2

Comparison of Mean Number of Errors Made by the Groups in Exp. IV and V

                    Exp. IV (Maze)           Exp. V (Time)         Exp. IV & V Combined*
Comparison        Mean       t      p      Mean       t      p        χ²       p
Groups 4 & 0     2.5/3.9   1.70    .08    1.9/4.2   3.04    .01      16.5     .01
Groups 4 & 1     2.5/2.6    .12    .90    1.9/3.1   1.70    .08       5.5     .26
Groups 1 & 0     2.6/3.9   1.58    .12    3.1/4.2   1.34    .15       7.7     .10

* The probabilities of the same comparison are combined by the chi-square technique.




TABLE 3

Analysis of Variance of Errors in Exp.
IV and V to a Criterion of Zero
Errors during Test Series

Source                      df    Mean square      F
Days reward                  2      268.7         14.2*
Interpolated activities      1      112.0
DR X IA                      2        1.5
Within groups               54       18.8
Total                       59

* Significant at the .001 level.


Ss are to resist the effects of interpo- 
lated activities. 


Discussion 


The major aim of the preliminary 
experiments was to discover an experi- 
mental procedure capable of producing 
the latent learning phenomenon. In 
achieving this aim some hints as to the 
nature of the experimental variables 
determining the latent learning phenom- 
enon were provided. 


Table 5 summarizes the procedure used 
in seven studies designed to produce 
latent learning; three of them are the 
preliminary experiments presented in 
this paper. The most striking feature 
of this table is that latent learning has 
never been obtained when guillotine 
doors were used; i.e., swinging doors 
appear to be an essential variable in 
obtaining latent learning. 

The mere presence of swinging doors, 
however, does not appear to be a suffi- 
cient condition since Meehl and Mac- 
Corquodale (12) used swinging doors 
and failed to obtain positive evidence. 
It should be noted that in their study 


TABLE 4

Comparison of Mean Number of Errors
Made by the Groups in Exp. IV and
V to a Criterion of Zero Errors

Comparison         Mean         t        p
Groups 4 & 0     4.1/11.4     5.36     .001
Groups 4 & 1     4.1/7.2      2.21     .03
Groups 1 & 0     7.2/11.4     3.22     .01
















a relatively short period (1 min.) was 
used both to detain the no-food groups 
and feed the food groups. The results 
of these seven studies suggest, therefore, 
that the latent learning phenomenon is 
a function, to some extent, of swinging 
doors and “relatively long” periods in 
the end box. 
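The two observations drawn from Table 5 can be checked mechanically. The sketch below transcribes the seven rows of the table into a hypothetical tuple layout (study, door, feeding minutes, detention minutes, whether latent learning was obtained); it is offered only as a tabulation and adds no data beyond the table.

```python
# Rows transcribed from Table 5. None marks an interval reported as unknown.
STUDIES = [
    ("Blodgett (1)", "swinging", 3, 2, True),
    ("Tolman & Honzik (18)", "swinging", None, None, True),
    ("Reynolds (15)", "guillotine", 3, 2, False),
    ("Meehl & MacCorquodale (12)", "swinging", 1, 1, False),
    ("Kanner, Exp. I", "guillotine", 2, 2, False),
    ("Kanner, Exp. II", "swinging", 3, 3, False),
    ("Kanner, Exp. III", "swinging", 4, 4, True),
]

# Studies that obtained latent learning with guillotine doors (none expected).
obtained_with_guillotine = [
    name for name, door, _, _, obtained in STUDIES
    if door == "guillotine" and obtained
]

# Studies that used swinging doors and still failed to obtain the effect.
swinging_without_effect = [
    name for name, door, _, _, obtained in STUDIES
    if door == "swinging" and not obtained
]
```

The first list is empty, and the second shows that swinging doors alone were not sufficient in two of the five studies that used them.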

The main experiments of the present 
study were concerned with only one 
issue. They sought to discover whether 
the latent learning group in the Cali- 
fornia study (18) actually “learned as 
much” as did Ss who received food at 
the end of each test trial. If it did, then 
it would be expected that the perform- 
ance of the no-food and food groups 
would be similar following any common 
interpolated activity. The results of the 
main experiments were inconsistent with 
such an expectation. The data indicate 
that there was a positive relationship 
between the number of food-reinforced 
trials during the so-called “latent learn- 
ing” period and the performance of Ss 
in the experimental maze following 
interpolated activity. These results are 
obviously damaging to the theoretical 
structure developed by Tolman and his 
associates. They have continuously assumed
that the no-food Ss in the California
latent learning studies learned
as much about the “correct path” to the 
end box as did the food-rewarded Ss. 

It seems clear that more detailed in- 
formation is needed concerning the 
animals’ behavior in a multi-unit maze 
before more precise hypothesizing is 
possible. The present investigation has 
suggested some of the broader variables 
influencing multi-unit maze learning, 
but it is apparent that more experi- 
mentation utilizing more detailed ob- 
servation (i.e., mechanical or electrical 
recording) would provide data for filling 
in the important hiatuses in our knowl- 
edge of what the animal does in this 
type of learning situation. 


SUMMARY 


Three preliminary experiments are reported
which describe attempts to reproduce the latent
learning phenomenon reported in the California
latent learning study (18). In Exp. I and II


































TABLE 5

End-Box Time Interval and Type of Door in Seven “Latent Learning” Studies

                                             End-Box Time Interval (Min.)
Investigator                   Door          F Group        NF Group        “Latent Learning”
                                             (feeding)      (detention)     Results
Blodgett (1)                   swinging      3              2               obtained
Tolman & Honzik (18)           swinging      unknown        unknown         obtained
Reynolds (15)                  guillotine    3              2               not obtained
Meehl & MacCorquodale (12)     swinging      1              1               not obtained
Kanner, Exp. I                 guillotine    2              2               not obtained
Kanner, Exp. II                swinging      3              3               not obtained
Kanner, Exp. III               swinging      4              4               obtained























latent learning was not obtained because the 
no-food groups tended to eliminate the blind 
alley entrances during the “latent learning” 
period. The latent learning phenomenon was, 
however, duplicated in Exp. III in which food 
was initially introduced for the no-food Ss at the 
end of the fourth trial. These experiments sug- 
gest that the latent learning phenomenon is 
dependent upon the use of swinging doors and 
a relatively long detention period in the end box 
for the no-food Ss. 

The two main experiments were designed to 
discover whether the no-food Ss in the California 
latent learning study (18) learned as much as Ss 
that were rewarded with food in the end box at 
the conclusion of each trial. An experimental 
design was used in which an interpolated activity 
intervened between a training and a test series. 

In Exp. IV, the interpolated activity involved 
feeding Ss in the blind alleys, and in Exp. V a 
30-day detention period served as the inter- 
polated activity. In both experiments the re- 
sults indicate that a positive relationship exists 
between the amount of food-rewarded trials 
during training and resistance to the interfering 
effects of the interpolated activities. 


REFERENCES 


1. Blodgett, H. C. The effect of the introduction
of reward upon the maze performance
of rats. Univ. Calif. Publ.
Psychol., 1929, 4, 113-134.

2. Farber, I. E. Response fixation under
anxiety and non-anxiety conditions. J.
exp. Psychol., 1948, 38, 111-131.

3. Grice, G. R. An experimental study of the
gradient of reinforcement in maze learning.
J. exp. Psychol., 1942, 30, 475-489.

4. Hull, C. L. Principles of behavior. New
York: D. Appleton-Century, 1943.

5. Hull, C. L. Behavior postulates and corollaries—1949.
Psychol. Rev., 1950, 57,
173-180.

6. Hull, C. L. Essentials of behavior. New
Haven: Yale Univ. Press, 1951.

7. Karn, H. W., & Porter, J. M., Jr. The
effects of certain pre-training procedures
upon maze performance and their significance
for the concept of latent learning.
J. exp. Psychol., 1946, 36, 461-469.

8. Kendler, H. H. Some comments on
Thistlethwaite’s perception of latent
learning. Psychol. Bull., 1952, 49, 47-51.

9. Lindquist, E. F. Statistical analysis in
educational research. New York: Houghton
Mifflin, 1940.

10. MacCorquodale, K., & Meehl, P. E. On
the elimination of cul entries without
obvious reinforcement. J. comp. physiol.
Psychol., 1951, 44, 367-371.

11. McNemar, Q. Psychological statistics. New
York: Wiley, 1949.

12. Meehl, P. E., & MacCorquodale, K. A
failure to find the Blodgett effect and
some secondary observations on drive
conditioning. J. comp. physiol. Psychol.,
1951, 44, 178-183.

13. Munn, N. L. Handbook of psychological
research on the rat. New York: Houghton
Mifflin, 1950.

14. Perin, C. T. A quantitative investigation
of the delay of reinforcement gradient.
J. exp. Psychol., 1943, 32, 37-52.

15. Reynolds, B. A repetition of the Blodgett
experiment on “latent learning.” J. exp.
Psychol., 1945, 35, 504-516.

16. Tolman, E. C. Purposive behavior in animals
and men. New York: Century,
1932.

17. Tolman, E. C., & Honzik, C. H. Degrees
of hunger, reward and non-reward, and
maze learning in rats. Univ. Calif. Publ.
Psychol., 1930, 4, 241-256.

18. Tolman, E. C., & Honzik, C. H. Introduction
and removal of reward and maze
performance of rats. Univ. Calif. Publ.
Psychol., 1930, 4, 257-275.


(Received January 25, 1954) 





Journal of Experimental Psychology
Vol. 48, No. 3, 1954





A NOTE ON THE BALLARD REMINISCENCE PHENOMENON 


HELEN AMMONS¹ AND ARTHUR L. IRION


Tulane University 


The reminiscence phenomenon was 
discovered by Ballard (1) in 1913. In 
a series of experiments, he demon- 
strated that the memory of children 
for incompletely learned poetry tended 
to increase for a period of several days 
following the cessation of practice. 
Amount of reminiscence appeared to 
be a function of chronological age and 
meaningfulness of the material learned. 
Younger Ss showed more reminiscence 
than did older ones, and the amount of 
reminiscence obtained was greater for 
the more meaningful material. Bal- 
lard’s results were subsequently corro- 
borated by several investigators (2, 
3, 6). 

All of these early experiments on 
reminiscence, however, suffered from a 
methodological flaw. The Ss in the 
experimental groups practiced on the 
task to be learned, were then given an 
immediate retention test, were allowed 
to rest for an appropriate period, and 
were then given a second test of 
retention. The possibility exists, 
therefore, that the increase in reten- 
tion labeled reminiscence by Ballard 
and others was, in fact, merely the 
result of practice received while Ss 
were taking the first retention test. 
Ward’s (5) study in 1937 corrected 
this flaw of experimental design, and 
all subsequent experiments in the field 
of reminiscence have employed Ward’s 
basic design. Under this procedure, 
reminiscence is defined in terms of a 
comparison of the scores of two (or 
more) groups; an experimental group 
which practices, rests, and is given a 
delayed retention test, and a control 
group which practices and then is 


1 Now with the VA Center, Wood, Wisconsin. 


given an immediate test of retention. 
McGeoch (5) refers to two types of 
reminiscence based on this methodo- 
logical difference, the Ballard- 
Williams type and the Ward-Hovland 
type. 

Using the improved experimental 
design, Ward, Hovland, and many 
other investigators obtained amounts 
of reminiscence far smaller than the 
amounts obtained by Ballard and the 
other early investigators. Further- 
more, in the later experiments, the 
memory gains appeared to persist 
only for a few minutes rather than for 
a period of several days. It has been 
customary to explain these differences 
in the amount and duration of remi- 
niscence in terms of the methodo- 
logical differences we have just des- 
cribed. However, other important 
differences in procedure exist between 
the early Ballard-Williams type 
studies and the later Ward-Hovland 
type experiments. For example, 
Ballard and Williams both found, as 
noted above, that amount of remi- 
niscence decreased with increasing age 
of Ss. In the Ward-Hovland type 
studies, college students have almost 
invariably been used as Ss. Thus, on 
the basis of the earlier findings re- 
garding the relationship between 
amount of reminiscence and chronological
age, we would not have expected
that much reminiscence would have 
been obtained in the later studies. 
Also, Ballard found reminiscence to 
decrease as meaningfulness of the 
material decreased. The Ward- 
Hovland type studies have tended to 
use nonsense syllable learning rather 
than the learning of poetry or prose. 
Again, on the basis of the earlier 



















TABLE 1

Number of Lines Recalled in Retention Tests

                          First Test                     Second Test
Condition     N      Time         Mean     SD       Time       Mean     SD
I            26      Immediate    9.23    4.95
II           26      2 days       6.81    4.76
III          23      7 days       6.13    3.41
IV           26      Immediate    9.73    4.43      2 days    10.05    5.78
V            23      Immediate    9.00    5.06      7 days     7.96    4.91























findings, not much reminiscence 
should have been obtained in the later 
studies. Thus, the possibility re- 
mains that the high amounts of 
reminiscence obtained by Ballard may 
not be due entirely to the uncontrolled 
effects of the first retention test. The 
present experiment was designed to 
repeat Ballard’s work in its essential 
details, both with and without the 
proper experimental controls. 


PROCEDURE 


Subjects.—The 130 Ss in this experiment were
drawn at random from the population of 491 
seventh grade pupils enrolled in the Istrouma 
Junior High School in Baton Rouge, Louisiana.²
These Ss were assigned at random to five ex- 
perimental groups. Because of illness, dropouts 
from school, etc., 6 Ss were lost from the experi- 
ment. It is not thought that this constituted 
an important source of sample bias. 

Materials.—As in Ballard’s experiment, poetry 
was selected as the material to be learned in this 
study. Preliminary testing with seventh grade 
children from another school led to the selection 
of the first 24 lines of “The Spider and the Fly” 
as being of an appropriate level of difficulty. 
This poem was learned in a group situation by 
the method of complete presentation. The 
poem was projected on a screen in front of Ss 
(the visual aids room of the school was employed 
for the learning and testing sessions), E read the 
poem aloud once, following which Ss and E 
read the poem aloud and in unison for eight 
repetitions. 

Testing procedures.—Retention was tested by 
having Ss write out the poem on a special form 
on which 24 lines had been ruled, one for each 


2 We should like to take this opportunity to 
thank Mr. L. Norman Day, principal of the 
Istrouma Junior High School, and his staff for 
their help and cooperation.


line of the poem. Number of lines correctly 
reproduced was taken as the score. Perfect 
reproduction of a line was required with the 
exception that punctuation and spelling were 
ignored and that such abbreviations, expansions, 
or contractions as “you’re” for “you are,” “it’s” 
or “it is” for “’tis,” “never” for “ne’er,” “I will” 
for “I’ll,” “in” or “to” for “into” were considered
correct.

Experimental design.—The Ss were assigned
at random to the following five experimental 
conditions: (I) Original learning (OL), immedi- 
ate test of retention; (II) OL, 2-day rest, test of 
retention; (III) OL, 1-week rest, test of retention;
(IV) OL, first test of retention, 2-day rest,
second test of retention; (V) OL, first test of
retention, 1-week rest, second test of retention.

It will be seen that Cond. IV and V parallel 
Ballard’s procedure, while Cond. I, II, and III 
provide a properly controlled experiment for the 
determination of reminiscence using the same 
time intervals. In the case of the delayed-recall 
groups (II, III, IV, and V), none of the Ss was 
told that the delayed test was to be made. It 
was felt, however, that if these delayed tests 
were run at different times, rumor would serve 
to alert Ss who received the later retention tests. 
Because of this, the groups learned at different 
times, but all were tested for retention on the 
same day. 


RESULTS 


The means for the various condi- 
tions are presented in Table 1. It 
will be recalled that Cond. IV and V, 
as well as Cond. I, provide for an 
immediate test of retention. Since 
these means were obtained under 
comparable conditions, they may be 
combined to provide a more stable 
estimate of level of immediate recall. 
The resulting combined mean is 9.33. 
With this estimate of immediate recall, 







TABLE 2

PERCENTAGE OF RECALL AFTER TWO DAYS AND ONE WEEK*

Retention Interval | Effect of First Test Controlled | Effect of First Test Uncontrolled (Ballard’s Procedure) | Ballard’s Results
2 days | 73 | 107 | 108
7 days | 65 | 85 | 87

* The percentages were computed by using level of
immediate recall as 100%.


the figures in Table 2 were derived. 
This table presents percentage of 
recall after two days and one week 
when proper controls were in effect 
(first column) and under Ballard’s 
conditions (second column). It will 
be seen that “‘reminiscence” was ob- 
tained only when the effects of the 
immediate test of retention were un- 
controlled. When the proper con- 
trols were introduced, a negatively 
accelerated retention curve was ob- 
tained. The loss of retention, under 
these circumstances, was significant at 
the 2% level of confidence after two 
days and at the 1% level after seven 
days. On the other hand, the rise in 
retention obtained under Cond. IV 
was not significant. The difference 
between retention after two days with 
and without the proper controls 
(Cond. II vs. Cond. IV) was signifi- 
cant at the 5% level. (All signifi- 
cance figures are in terms of t tests for 
related or unrelated measures as might 
be appropriate.) 
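The percentage scale used in Table 2 can be sketched as follows. This is a minimal illustration, assuming only that each delayed-recall mean is divided by the combined immediate-recall mean of 9.33 reported above. The function name and the example delayed mean are hypothetical and are not values taken from Table 1.

```python
def percent_recall(delayed_mean: float, immediate_mean: float) -> float:
    """Express a delayed-recall mean as a percentage of immediate recall."""
    if immediate_mean <= 0:
        raise ValueError("immediate_mean must be positive")
    return 100.0 * delayed_mean / immediate_mean


# Combined immediate-recall mean reported in the text (Cond. I, IV, and V).
IMMEDIATE_MEAN = 9.33

# Hypothetical delayed-recall mean, used only to show the arithmetic.
example_delayed_mean = 7.0

print(round(percent_recall(example_delayed_mean, IMMEDIATE_MEAN)))
```

A value above 100 on this scale would indicate apparent reminiscence, and a value below 100 would indicate forgetting.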

It is interesting to compare the 
results of this experiment with those 
obtained by Ballard for the same time 
intervals. The third column of Table 
2 shows Ballard’s results for retention 
of “The Ancient Mariner.” It will 
be seen that Ballard’s results are 
closely duplicated in the present 
experiment. This implies that the 
failure to obtain reminiscence in 
Cond. II and III of the present experi- 







ment cannot be attributed to ex- 
traneous differences in procedure as 
between this experiment and Ballard’s 
investigation. Rather, the implica- 
tion is that Ballard’s results must be 
attributed to his failure to control for 
the beneficial effects of the immediate 
test of retention. 


SUMMARY AND CONCLUSIONS 


It has long been suspected that the large 
amounts of reminiscence found by Ballard, 
Williams, and others were attributable to a flaw 
in the experimental design used by these in- 
vestigators. However, it was possible that the 
differences in the results of these early studies 
and the later ones by Ward, Hovland, and others 
could be attributed to other differences in experi- 
mental procedure, and, in fact, from the findings 
of Ballard and Williams, it was possible to predict 
that only small amounts of reminiscence would 
be obtained when nonsense material was learned 
by adult Ss. Accordingly, the present study 
was designed to attempt to reproduce Ballard’s 
results under his conditions, and then to investi- 
gate the effects of introducing the proper con- 
trols. It was possible to reproduce Ballard’s 
results with considerable precision. However, 
when the proper controls were introduced, the 
“reminiscence” effect vanished. There is a 
strong presumption, therefore, that the results 
obtained by Ballard (1), Huguenin (2), Williams 
(6), and G. O. McGeoch (3) are spurious and 
that reminiscence obtained under this type of 
design is an artifact of the experimental method 
employed. 


REFERENCES 


1. Ballard, P. B. Obliviscence and reminiscence.
Brit. J. Psychol., Monogr. Suppl., 1913, 1, No. 2.

2. Huguenin, C. Reviviscence paradoxale.
Arch. d. Psychol., 1914, 14, 379-383.

3. McGeoch, G. O. The conditions of reminiscence.
Amer. J. Psychol., 1935, 47, 65-87.

4. McGeoch, J. A. The psychology of human
learning. New York: Longmans, Green, 1942.

5. Ward, L. B. Reminiscence and rote learning.
Psychol. Monogr., 1937, 49, No. 4 (Whole No. 220).

6. Williams, O. A study of the phenomenon
of reminiscence. J. exp. Psychol., 1926,
9, 368-387.


(Received February 1, 1954) 

























Journal of Experimental Psychology 
Vol. 48, No. 3, 1954 


KNOWLEDGE OF RESULTS IN THE ACQUISITION 
AND TRANSFER OF A GUNNERY SKILL¹


MYMON GOLDSTEIN AND CARL H. RITTENHOUSE²
Armament Systems Personnel Research Laboratory, AF Personnel and Training Research Center 


Sighting performance on the ped- 
estal sight gunnery station consists 
in keeping a reference dot on the nose 
of an attacking aircraft (tracking) 
and simultaneously adjusting a circle 
of dots so that it exactly encloses the 
aircraft’s wingtips (ranging). Persons
only passingly familiar with this 
operator task might take it for granted 
that knowledge of results would be 
inherent in the visual display pre- 
sented. The operator sees the air- 
craft and also sees the dots he at-
tempts to position properly; therefore,
he might be expected to know whether 
or not he is doing the job correctly. 
Actually, experimental study of the 
task in question has indicated that 
knowledge of results is not complete, 
and that the operator has some 
difficulty in assessing his own per- 
formance, particularly with respect to 
ranging. This difficulty in perform- 
ance assessment probably results 
chiefly from two factors: (a) The 
“circle” for ranging is only an ima- 
ginary one, since it consists of dots 
with relatively large spaces between 
them. There is reason to believe that 
operators do not always estimate the 
boundaries of this imaginary circle 
properly (1). (b) The target aircraft 
consistently changes in position and 
size, often rapidly, thus making 
possible for the operator only a vague, 


¹ The data reported in this paper were col-
lected as part of the United States Air Force
Personnel and Training Research and Develop- 
ment Program. The opinions or conclusions 
contained in this report are those of the authors. 
They are not to be construed as reflecting the 
views or indorsement of the Department of the 
Air Force. 

² Now on the staff of the Army Field Forces
Human Research Unit No. 2, Fort Ord, Cali- 
fornia. 


general impression of how well he is 
doing. 

Because of the operator’s incom- 
plete knowledge of results, attempts 
have been made to improve perform- 
ance by using supplementary cues as 
an aid in pedestal sight gunnery train- 
ing. The cues receiving most sys- 
tematic attention to date have been 
those coincident with sighting per- 
formance, those that occur throughout 
the period that the operator is on 
target (within specified tolerances) 
and fail to occur when he is in error. 
A cue of this type used frequently is a 
reddening of the target, produced by 
a filter; another, less frequently used, 
is the sounding of a buzzer. The 
effects of the two cues have been found 
to be similar (7). 


Studies employing cues coincident with sight- 
ing performance have been reported by Under- 
wood (10), Houston (4), Morin and Gagné (7), 
and Bilodeau (2). It has been found that in the 
presence of these cues there is a marked gain in 
performance, which results chiefly from im- 
provement in the ranging part of the task, but 
that immediately upon cue removal a large por- 
tion of this gain vanishes. There is some dis- 
agreement about whether or not any of the gain 
persists after cue removal. Several of the au- 
thors cited have suggested that the rise in scores 
concomitant with cue presentation may be at- 
tributed largely to the development of a dis- 
crimination habit different from the one which 
must be used by the gunner on a pedestal sight 
under ordinary conditions. In the absence of a 
supplementary cue, the gunner has no choice but 
to make an on-target vs. off-target judgment on 
the basis of the usual visual display presented by 
the sight. In the presence of a supplementary 
cue, however, the gunner is capable of perform- 
ing well by relying chiefly on a cue-occurrence vs. 
cue-nonoccurrence discrimination with only 
minimal attention to aspects of the usual visual 
display. This hypothesis offers an attractive 
explanation for the marked decline in scores that 
has occurred immediately upon removal of the 
filter or buzzer in all past studies. Since a large 




portion of the improvement in performance 
which accompanies supplementary cue presenta- 
tion is attributed to a cue-occurrence vs. cue- 
nonoccurrence discrimination, it is to be expected 
that removing the possibility of making such a 
discrimination would cause an immediate 
marked decline in scores. 

The series of studies reported below 
was designed to answer the following 
questions: (a) Does the use of a 
supplementary cue coincident with 
sighting performance result in super-
ior ability after cue removal? (b) If
superior ability does result, is it 
transferable from one training device 
to another? (c) Does amount of cue 
presentation affect performance, as 
suggested by the studies of Houston 
(4) and Bilodeau (2), and is pattern 
of selection of trials for cue presenta- 
tion (random vs. systematic) a rele- 
vant variable? 

A further objective of the present 
studies was the exploratory investi- 
gation of types of knowledge of results 
not coincident with sighting perform- 
ance. Inasmuch as there had been 
criticism of the coincident cues on the 
grounds that they induce a habit 
which competes with the one that 
should be taught, it was decided to 
investigate the efficacy of other meth- 
ods of communicating information re- 
garding performance which were not 
likely to induce such a competitive 
habit. The methods selected were 
presented at the conclusions of training 
trials, and consisted of spoken state- 
ments evaluating the preceding per- 
formance. Investigation of these 
methods was designed to provide an- 
swers to questions paralleling the ones 
stated in the preceding paragraph. 


METHOD


Apparatus.—Two devices were used: the SAM
Pedestal Sight Manipulation Test (PSMT), 
which has been described by Melton (6), and 
the Flexible Gunnery Research Device (FGRD), 
a redesigned version of the PSMT, described by 
Spieth (9). Both of these devices simulate the 
tracking-ranging task of the pedestal sight 
gunner. Pistol grips were used; these were
similar to control grips “B” discussed by
Johnson and Milton (5). The goal in tracking
was to keep a reference dot on the nose of an
attacking aircraft by rotating the sight head to 
the left or right (azimuth) and up or down 
(elevation); the goal in ranging was to keep a 
circle of dots exactly around the wingtips of the 
aircraft by pressing inward on a spring mecha- 
nism with the right hand. When performance 
is accurate, the gunner is usually expected to 
trigger as well, but the triggering portion of the 
task was eliminated in the present studies. The 
target-presentation mechanism on both devices 
is a projector which casts a moving aircraft image 
onto a curved screen. 

Performance is measured electronically by a 
group of clocks, each of which is activated when 
S is accurate with respect to the component of 
the task (azimuth, elevation, or range) or the 
combination of components to which the clock 
is set to respond. In the present studies, scores 
were recorded for time on target simultaneously 
in azimuth, elevation, and range (AER), and in 
range alone (R). Two additional clock scores 
were available to E on the PSMT: the time per 
trial that the ranging circle was too large and 
the time it was too small. These two scores 
were used only for knowledge-of-results pro- 
cedures, however, and served no purpose after 
data collection had been completed. 
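The relation between the AER and R clocks can be sketched in code. This is an illustration only: the actual apparatus used continuously running clocks, whereas the sketch assumes a record sampled at a fixed interval. The `Sample` class, the `clock_scores` function, and the sample record are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Sample:
    """On-target state at one hypothetical sampling instant."""
    azimuth_ok: bool
    elevation_ok: bool
    range_ok: bool


def clock_scores(samples: Iterable[Sample], dt_min: float) -> dict:
    """Accumulate time on target (minutes) for the AER and R clocks."""
    aer_ticks = 0
    r_ticks = 0
    for s in samples:
        if s.range_ok:
            r_ticks += 1  # R clock runs whenever ranging is accurate
            if s.azimuth_ok and s.elevation_ok:
                aer_ticks += 1  # AER clock needs all three components at once
    return {"AER": aer_ticks * dt_min, "R": r_ticks * dt_min}


# Hypothetical four-sample record, one sample per thousandth of a minute.
record = [
    Sample(True, True, True),
    Sample(True, False, True),
    Sample(False, False, False),
    Sample(True, True, True),
]
scores = clock_scores(record, dt_min=0.001)
print(scores)
```

Because the AER clock requires accuracy in all three components at once, its total can never exceed that of the R clock.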

A trial on the PSMT consisted of eight target 
attack flights, for which clock scores were 
summed. Most of the studies were conducted 
on a single PSMT unit, whose trial length was 
about 1.69 min. (actual scorable attack flight 
time, disregarding interattack intervals); how- 
ever, one of the studies, which will be designated 
Study 2 in the following section, was run in its 
entirety on a different unit, whose trial length 
was approximately 1.67 min. 

On the FGRD, there is no inherent counter- 
part of the PSMT trial, and attacks are selected 
singly out of a total of 96 possibilities. In order 
to increase the similarity between the tasks pre- 
sented by the two devices, only eight of the 
possible attacks were used, and these occurred 
in a fixed order. A series of all eight attacks was 
considered a single trial, and trial length was 
about 1.25 min. of attack flight time. Clock 
scores for the attacks within a trial were summed.³


³ The attacks chosen had the following pat-
terns, which are given in the order of their occur-
rence and are based on a system of nomenclature
devised in an earlier study (8): A1E3R2S2,
A2E4R1S1, A3E3R1S1, A2E1R1S2, A3E4R2S1,
A4E2R1S2, A4E2R2S1, and A1E1R2S2. A1
and A2 refer to azimuth movement from left to
right; A3 and A4 to azimuth movement from
right to left. E1, E3, and E4 represent elevation
movement at three different rates in a downward
direction; E2, elevation movement in an upward
direction. R1 and R2 are two closely similar
patterns of movement in range. S1 and S2 rep-
resent attack durations of .214 and .099 min.,
respectively.
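The attack nomenclature can be checked against the stated FGRD trial length. The sketch below reads the eight codes as A1E3R2S2, A2E4R1S1, A3E3R1S1, A2E1R1S2, A3E4R2S1, A4E2R1S2, A4E2R2S1, and A1E1R2S2, decodes each one, and sums the S1 and S2 durations. The parser and its field names are assumptions made for illustration.

```python
import re

ATTACKS = [
    "A1E3R2S2", "A2E4R1S1", "A3E3R1S1", "A2E1R1S2",
    "A3E4R2S1", "A4E2R1S2", "A4E2R2S1", "A1E1R2S2",
]

# Attack durations in minutes for S1 and S2, as given in the footnote.
DURATION_MIN = {1: 0.214, 2: 0.099}

PATTERN = re.compile(r"A([1-4])E([1-4])R([12])S([12])")


def parse_attack(code: str) -> dict:
    """Decode one attack code into its four components."""
    match = PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"unrecognized attack code: {code!r}")
    a, e, r, s = (int(g) for g in match.groups())
    return {
        "azimuth": "left to right" if a in (1, 2) else "right to left",
        "elevation": "upward" if e == 2 else "downward",
        "range_pattern": r,
        "duration_min": DURATION_MIN[s],
    }


trial_length = sum(parse_attack(code)["duration_min"] for code in ATTACKS)
print(f"{trial_length:.3f} min")
```

Four attacks of each duration give 4(.214) + 4(.099) = 1.252 min, which agrees with the trial length of about 1.25 min stated in the text.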



















Accuracy on the PSMT and the FGRD is a 
function of “scoring areas,” or tolerances, preset 
into the apparatus by E. In the present studies, 
scoring areas on the PSMT units were 14%; in. 
(22 mils) in azimuth, 1} in. (27 mils) in elevation, 
and §-in. circle diameter (10.5 mils) in range. 
The scoring areas on the FGRD were those corre- 
sponding to the settings numbered 2, i.e., in 
azimuth and elevation, about 25 mils from limit
to limit, and in range, expressed as the ratio of
the diameter of the inner scoring limit to the 
target wingspan, or the target wingspan to the 
diameter of the outer scoring limit, about .82. 
It will be noted that the range area was fixed
on the PSMT, but varied as a function of target 
wingspan on the FGRD. 
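The FGRD range tolerance can be written out numerically. The sketch assumes that the ratio of about .82 applies to both the inner limit (inner diameter / wingspan) and the outer limit (wingspan / outer diameter), as the text states, and that S is scored on target in range when the circle diameter lies between the two limits. Function names and units are hypothetical.

```python
RANGE_RATIO = 0.82  # approximate ratio quoted for FGRD setting 2


def range_scoring_limits(wingspan: float) -> tuple:
    """Return (inner, outer) scoring-limit diameters for a given wingspan."""
    if wingspan <= 0:
        raise ValueError("wingspan must be positive")
    inner = RANGE_RATIO * wingspan   # inner / wingspan = .82
    outer = wingspan / RANGE_RATIO   # wingspan / outer = .82
    return inner, outer


def range_on_target(circle_diameter: float, wingspan: float) -> bool:
    """True when the ranging circle falls inside the scoring area."""
    inner, outer = range_scoring_limits(wingspan)
    return inner <= circle_diameter <= outer


inner, outer = range_scoring_limits(100.0)  # wingspan in arbitrary units
print(round(inner, 1), round(outer, 1))
```

Because both limits scale with the apparent wingspan, the tolerance widens as the target grows, which is the sense in which the range area varied on the FGRD but was fixed on the PSMT.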

Design and procedure.—Knowledge of results
was presented in three forms: a buzzer, a verbal, 
and a tuition treatment. The buzzer sounded 
continuously during trials whenever S was on 
target. The verbal treatment consisted of state-
ments at the conclusions of trials, telling S what
proportion of time he had remained on target, 
comparing performance on different parts of 
attacks, and comparing S’s most recent per- 
formance with his past record, all in terms of 
AER time on target, without mention of error 
tendencies. The tuition treatment was also pre- 
sented at the conclusions of trials, and consisted 
of statements concerning S’s specific error 
tendencies, as well as all information constituting 
the verbal treatment. Error discussion was 
limited to ranging, which is by far the most 
difficult component of the task. The E based
his analysis upon three ranging clocks, one of 
which ran when S was accurate, and the other 
two when his circle of dots was either too large 
or too small. 
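The kind of analysis E could make from the three ranging clocks can be sketched as follows. This is not the authors' procedure, which is not described in detail. It assumes that the three clocks together account for all scorable ranging time and that the dominant error tendency is simply the larger of the two error clocks. All names and readings are hypothetical.

```python
def ranging_error_summary(on_target: float, too_large: float, too_small: float):
    """Summarize three ranging clock readings (minutes) for one trial."""
    total = on_target + too_large + too_small
    if total <= 0:
        raise ValueError("clock readings must sum to a positive time")
    shares = {
        "on target": on_target / total,
        "too large": too_large / total,
        "too small": too_small / total,
    }
    if too_large > too_small:
        tendency = "circle too large"
    elif too_small > too_large:
        tendency = "circle too small"
    else:
        tendency = "no dominant tendency"
    return shares, tendency


# Hypothetical clock readings for a single trial.
shares, tendency = ranging_error_summary(0.50, 0.90, 0.29)
print(tendency, {k: round(v, 2) for k, v in shares.items()})
```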

The verbal treatment was studied separately 
from the tuition treatment in order to determine 
whether time-on-target information alone, with- 
out mention of specific error tendencies, could 
result in improved performance. It was felt 
that this limited information would at least 
indicate to S whether or not gross error was 
present, and perhaps enable him to discover his 
own error tendencies. It was also felt that the 
verbal treatment might increase S’s motivation. 
Since both the verbal and tuition treatments 
were presented at the conclusions of trials, they 
could only begin to influence performance on the 
trials immediately following. For convenience, 
these treatments will be said to have occurred 
on the first trials whose performance they could 
influence. 

The design is summarized in Table 1. Five 
studies were conducted. These studies were in 
many respects very similar to each other. The 
replicative aspect of the studies was helpful in 
deriving a coherent, dependable picture of most 
effects resulting from the experimental treat- 
ments. Each study involved an initial trial 
without knowledge of results, followed by a 
series of trials during which knowledge of results 


occurred. The series lasted for 38 trials in 
Studies 1-4 and 34 trials in Study 5. All studies 
ended with one or more trials for which knowl- 
edge of results was absent. In Studies 1-3, 
Trial 40 was the only such trial; in Study 4, 
there were 21 trials of this kind, numbered 
40-60; and in Study 5 there were 17 such trials, 
numbered 36-52. Trials 37-52 of Study 5 were 
presented on the FGRD. The remaining trials 
of Study 5 and all trials of Studies 1-4 were 
presented on the PSMT. 

During Studies 1 and 2 variation was intro- 
duced in amount and pattern of treatment 
presentation. (By “pattern” is meant random 
vs. systematic selection of trials on which a 
treatment was operative.) Within the series of 
trials devoted to knowledge of results, treatments 
occurred on 100% of trials, alternate trials, or 
50% of trials at random. Work in this area was
discontinued after Study 2 because little effect 
upon Ss’ performance resulted. 
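The three presentation schedules can be generated as below. This is a sketch under assumptions: the 100% and alternate schedules follow the trial numbers given for Studies 1 and 2, while the random schedule is drawn here with Python's `random` module, which is not the method the authors used (their method is not stated). Function names are hypothetical.

```python
import random

# Trials 2-39 make up the 38-trial knowledge-of-results series of Studies 1-4.
TREATMENT_TRIALS = list(range(2, 40))


def full_schedule() -> list:
    """Treatment on 100% of the trials in the series."""
    return list(TREATMENT_TRIALS)


def alternate_schedule() -> list:
    """Treatment on alternate trials, beginning with 3 and ending with 39."""
    return list(range(3, 40, 2))


def random_schedule(seed=None) -> list:
    """Treatment on a randomly chosen 50% of the trials in the series."""
    rng = random.Random(seed)
    k = len(TREATMENT_TRIALS) // 2
    return sorted(rng.sample(TREATMENT_TRIALS, k))


print(len(full_schedule()), len(alternate_schedule()), len(random_schedule(seed=0)))
```

The alternate and random schedules each contain 19 trials, so they differ in pattern but not in amount of treatment.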

Group designation in Table 1 consists of a 
number, representing the study, followed by one 
or two letters, representing the treatment: C 
indicates a control group; B, a 100% buzzer 
group; AB, an alternate buzzer group; RB, 
random buzzer; V, 100% verbal; RV, random 
verbal; and T, 100% tuition. 

Each of the studies required four to six weeks 
for its administration. During any week, either 
12 or 16 Ss were used. The Ss were divided into 
subgroups of four, and within subgroups were 
each assigned to a different one of the four groups
involved in that particular study. Care was 
taken to have all Ss of one subgroup succeed 
each other on the training device before members 
of any other subgroup were introduced. Within 
subgroups, the sequence in which Ss took their 
turns was determined by randomization. 
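The weekly assignment procedure resembles a randomized block design and can be sketched as follows. The sketch assumes that assignment of Ss to groups within a subgroup was random (the text states this only for the order of turns), and the function and subject identifiers are hypothetical.

```python
import random


def assign_week(subject_ids: list, groups: list, seed=None) -> list:
    """Return (subject, group) pairs in running order for one week.

    Ss are taken in subgroups the size of the group list. Each S in a
    subgroup gets a different group, and a whole subgroup is run before
    the next subgroup begins.
    """
    size = len(groups)
    if len(subject_ids) % size != 0:
        raise ValueError("number of Ss must be a multiple of the number of groups")
    rng = random.Random(seed)
    plan = []
    for start in range(0, len(subject_ids), size):
        block = subject_ids[start:start + size]
        shuffled_groups = rng.sample(groups, size)  # one group per S
        pairs = list(zip(block, shuffled_groups))
        rng.shuffle(pairs)  # random order of turns within the subgroup
        plan.extend(pairs)
    return plan


plan = assign_week(list(range(12)), ["3C", "3B", "3V", "3T"], seed=1)
for subject, group in plan:
    print(subject, group)
```

Running each subgroup to completion before the next spreads any drift in the apparatus or in testing conditions evenly over the four groups.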

Instructions were similar to those described 
by Melton (6), with some minor changes to 
adapt them to present conditions. 

Subjects—The Ss were airmen stationed at 
Lowry Air Force Base. They had had no 
previous experience of any kind with the pedestal 
sighting station. The number of Ss who com- 
pleted the experimental program for each group 
during each study appears in Table 1. Groups 
were of unequal size within studies, even though 
equal groups should have resulted from the 
procedure outlined in the previous section. The 
cause of the discrepancy is that a number of Ss 
dropped out while the studies were in progress 
for reasons of illness or other administrative 
necessity. 


RESULTS 


Scores obtained in the five studies 
appear in Fig. 1-5. The points 


plotted are for AER (simultaneous 
tracking and ranging accuracy), and 
represent mean time on target per 




























TABLE 1
EXPERIMENTAL DESIGN

Study | Amount of Practice | Group | N | Treatment | Occurrence of Treatment*
1 | 40 trials, PSMT, at the rate of 8 per day | 1C | 16 | Control (no cue) |
  |  | 1B | 15 | Buzzer | Trials 2-39
  |  | 1AB | 16 | Buzzer | Alternate trials, beginning with 3 and ending with 39
  |  | 1RB | 16 | Buzzer | Trials 3, 4, 5, 6, 7, 8, 11, 12, 13, 15, 19, 22, 24, 26, 27, 31, 33, 37, 39, i.e., 50% of trials, at random
2 | 40 trials, PSMT, at the rate of 8 per day | 2B | 14 | Buzzer | Trials 2-39
  |  | 2RB | 16 | Buzzer | Trials 2, 3, 4, 6, 10, 12, 13, 16, 17, 18, 20, 22, 23, 26, 33, 34, 36, 37, 39, i.e., 50% of trials, at random
  |  | 2V | 15 | Verbal | Trials 2-39
  |  | 2RV | 15 | Verbal | Same randomization as for Group 2RB
3 | 40 trials, PSMT, at the rate of 8 per day | 3C | 18 | Control (no cue) |
  |  | 3B | 18 | Buzzer | Trials 2-39
  |  | 3V | 16 | Verbal | Trials 2-39
  |  | 3T | 16 | Tuition | Trials 2-39
4 | 60 trials, PSMT, at the rate of 12 per day | 4C | 14 | Control (no cue) |
  |  | 4B | 15 | Buzzer | Trials 2-39
  |  | 4V | 16 | Verbal | Trials 2-39
  |  | 4T | 14 | Tuition | Trials 2-39
5 | 36 trials, PSMT, at the rate of 12 per day, followed by 16 trials, FGRD, at the rate of 8 per day | 5C | 19 | Control (no cue) |
  |  | 5B | 17 | Buzzer | Trials 2-35
  |  | 5V | 17 | Verbal | Trials 2-35
  |  | 5T | 16 | Tuition | Trials 2-35

* The buzzer was used during the trials indicated; verbal and tuition treatments were presented at the con-
clusions of the trials immediately preceding those indicated.


trial per group, expressed as a per- 
centage of maximum possible time. 
Tests of statistical significance that 
have been applied to the AER data
are summarized in Table 2. The 
analysis of variance technique has 
been used throughout, after a pre- 
liminary check with the Bartlett test 
for homogeneity of variance. For 
every analysis in the table, the homo- 
geneity of variance assumption is 
tenable. In all studies, a test has 
been applied to scores on Trial 1, 
which preceded knowledge of results, 
and on the trial after cessation of 
knowledge of results. In Study 4, an 
additional test has been applied to 
scores on Trial 60, which concluded 


the final series of 21 trials without 
knowledge of results; and in Study 5 
there are tests at the beginning and 
end of practice on the second trainer. 
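The arithmetic behind the reported tests can be reproduced from the published mean squares and group sizes. The sketch below recomputes each F ratio exceeding 1 as the between-groups mean square divided by the within-groups mean square, and derives the within-groups degrees of freedom from the Ns in Table 1. It does not compute probabilities or the Bartlett test. Variable names are hypothetical.

```python
# (study, trial, MS between, MS within, reported F), from the analyses of variance.
ROWS = [
    (1, 40, 349163.1, 43575.6, 8.01),
    (2, 40, 216346.2, 28756.9, 7.52),
    (3, 40, 447072.9, 48104.6, 9.29),
    (4, 40, 295147.0, 43097.2, 6.85),
    (4, 60, 75092.4, 56117.4, 1.34),
    (5, 1, 26653.8, 19050.3, 1.40),
    (5, 36, 205338.9, 32849.5, 6.25),
    (5, 52, 92579.7, 46968.6, 1.97),
]

# Group sizes per study, from the experimental design table.
GROUP_SIZES = {
    1: [16, 15, 16, 16],
    2: [14, 16, 15, 15],
    3: [18, 18, 16, 16],
    4: [14, 15, 16, 14],
    5: [19, 17, 17, 16],
}


def f_ratio(ms_between: float, ms_within: float) -> float:
    """F is the between-groups mean square over the within-groups mean square."""
    return ms_between / ms_within


def degrees_of_freedom(sizes: list) -> tuple:
    """Return (between df, within df) for a one-way analysis of variance."""
    return len(sizes) - 1, sum(sizes) - len(sizes)


for study, trial, msb, msw, reported in ROWS:
    print(study, trial, round(f_ratio(msb, msw), 2), reported)

for study, sizes in GROUP_SIZES.items():
    print(study, degrees_of_freedom(sizes))
```

The recomputed ratios agree with the reported values to two decimal places, and the derived degrees of freedom (3 between groups; 59, 56, 64, 55, and 65 within groups) match those tabled.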

Equivalence of groups.—The tests 
on Trial 1 have been used to check the 
assumption that groups were roughly 
equivalent before experimental treat- 
ment. This assumption is acceptable, 
since it is shown in Table 2 that 
differences between groups on Trial 1 
were insignificant in each of the 
studies. Trial 1 scores differed some- 
what from study to study, probably 
as a result of systematic changes in 
the PSMT apparatus. Therefore, ab-
solute level of performance will be
de-emphasized in interstudy com-
parisons.

[Figure 1. Per cent of maximum score plotted against practice trials for the 100% buzzer (Gp. 1B), 50% alternate buzzer (Gp. 1AB), 50% random buzzer (Gp. 1RB), and control (Gp. 1C) groups.]

Fig. 1. Performance curves for Study 1. Practice was given at the rate of eight trials per day.

Effects of components on AER
scores.—Although the results which
follow are for AER scores, represent-
ing the composite tracking-ranging
task, the influence of the buzzer,
verbal, and tuition treatments was
principally upon the R component.
Clock scores for R alone were available
throughout the studies performed, and
reacted to the experimental treat-
ments much as the AER scores.
Tracking scores were not recorded, but 


























TABLE 2
ANALYSES OF VARIANCE*

Study | Trial | Between Groups df | Between Groups Mean Square | Within Groups df | Within Groups Mean Square | F | p
1 | 1 | 3 | 30192.3 | 59 | 34125.2 | <1 |
1 | 40 | 3 | 349163.1 | 59 | 43575.6 | 8.01 | <.001
2 | 1 | 3 | 4152.3 | 56 | 14080.7 | <1 |
2 | 40 | 3 | 216346.2 | 56 | 28756.9 | 7.52 | <.001
3 | 1 | 3 | 20478.3 | 64 | 30696.1 | <1 |
3 | 40 | 3 | 447072.9 | 64 | 48104.6 | 9.29 | <.001
4 | 1 | 3 | 7436.4 | 55 | 17648.0 | <1 |
4 | 40 | 3 | 295147.0 | 55 | 43097.2 | 6.85 | <.001
4 | 60 | 3 | 75092.4 | 55 | 56117.4 | 1.34 | >.05
5 | 1 | 3 | 26653.8 | 65 | 19050.3 | 1.40 | >.05
5 | 36 | 3 | 205338.9 | 65 | 32849.5 | 6.25 | <.001
5 | 37 | 3 | 17195.0 | 65 | 24661.7 | <1 |
5 | 52 | 3 | 92579.7 | 65 | 46968.6 | 1.97 | >.05

* Mean squares are based on scores expressed in thousandths of a minute. The assumption of homogeneity
of variance is tenable in all cases.











[Figure 2. Per cent of maximum score plotted against practice trials for the 100% buzzer (Gp. 2B), 50% random buzzer (Gp. 2RB), 100% verbal (Gp. 2V), and 50% random verbal (Gp. 2RV) groups.]

Fig. 2. Performance curves for Study 2. Practice was given at the rate of eight trials per day.


an AE measure was estimated for a 
portion of the studies from the avail- 
able R and AER scores by means of a 
method introduced by Ellson (3). 
The AE scores exhibited little or no 
change which could be attributed to 
the treatments. 

Treatment amount and pattern.— 
Amount and pattern of treatment 
presentation seems to have made 
little difference. The scores of
Groups 1B, 1AB, and 1RB in Fig. 1 
differed only slightly and insignifi- 
cantly on Trial 40, which was the 
first trial after buzzer removal; Groups 
2B and 2RB in Fig. 2 received almost 
identical scores on this trial. Differ- 
ences during earlier trials were also 
not very great for buzzer groups that 
agreed with respect to treatment 
presence or absence, and do not seem 
worthy of detailed scrutiny. Varia-
tion in amount and pattern of treat-
ment presentation was represented in
only one of the nonbuzzer groups,
2RV in Fig. 2. Scores for Group
2RV were virtually identical with those
for Group 2V. This finding, coupled 
with the results of Study 1, brought 
about elimination of the treatment 
amount and pattern variables from 
the succeeding studies. 

Treatment introduction and re- 
moval.—The usual dramatic effects of 
buzzer introduction and removal may
be observed in the curves for the buz- 
zer groups. An immediate sharp rise 
in scores always accompanied the in- 
troduction of the buzzer, and an im- 
mediate sharp decline accompanied its 
removal. This resulted in the alter- 
nating peaks and troughs of the curves 
for Groups 1AB and 1RB in Fig. 1 and 
2RB in Fig. 2. For Groups 1B, 2B, 
3B, and 4B in Fig. 1-4, respectively, 
the effects were on Trials 2 and 40, 
and for Group 5B in Fig. 5 on Trials 
2 and 36. Use of the verbal and 
tuition treatments produced effects 
which were, on the whole, much more 
gradual than those related to the 
buzzer. In no instance did cessation 

























of these treatments result in a marked 
decline in scores. Treatment intro- 
duction did produce sharp rises in the 
curves for Group 3T in Fig. 3 and 
Groups 4V and 4T in Fig. 4, but only 
in the case of Group 3T was there no 
similar rise for the corresponding 
control group. Moreover, treatment 
introduction and removal were not 
reflected in peaks and troughs for 
Group 2RV in Fig. 2, as they were for 
the AB and RB groups in Fig. 1 
and 2. 

Learning to first posttreatment trial.— 
Learning occurred, to a greater or 
lesser degree, in all groups of all 
studies. The B groups displayed a 
steady, moderate, upward trend in 
the presence of the buzzer, subsequent 
to the sharp rise which first accom-
panied its introduction. After buzzer
removal, a sizable net gain was still
present, despite the marked drop in 
scores. The AB and RB groups also 
showed improvement, both on trials 
for which the buzzer was operative 




and on trials for which it was not. 
If the points representing performance 
of any AB or RB group in the pres- 
ence of the buzzer were connected to 
form one curve, and the points repre- 
senting performance of the same group 
in the absence of the buzzer were 
connected to form another curve, 
both curves would be relatively 
smooth and their shape fairly typical 
for the PSMT task. The V, RV, and 
T groups improved more slowly than 
the buzzer groups, and generally 
completed training with a smaller net 
gain in score. During the first quar- 
ter of the trials, learning was usually a 
little more rapid for the verbal and 
tuition groups than later on. The 
control groups yielded curves very 
similar to those for the verbal and 
tuition groups, but generally had 
scores a bit lower at the end of the 
training period. On the whole, all 
treatments were beneficial, in that 
they produced scores higher than 
those of the control group. The 





[Figure 3. Per cent of maximum score plotted against practice trials for the 100% buzzer (Gp. 3B), 100% verbal (Gp. 3V), 100% tuition (Gp. 3T), and control (Gp. 3C) groups.]

Fig. 3. Performance curves for Study 3. Practice was given at the rate of eight trials per day.











[Figure 4. Performance plotted against practice trials for the 100% buzzer (Gp. 4B), 100% verbal (Gp. 4V), 100% tuition (Gp. 4T), and control (Gp. 4C) groups.]

Fig. 4. Performance curves for Study 4. Practice was given at the rate of 12 trials per day.


buzzer groups clearly excelled, and 
there was usually little difference 
between the tuition and verbal groups, 
although such differences as occurred 
usually favored tuition. Analyses of 
variance performed on the intergroup 
differences for the first trial after 
cessation of knowledge of results 
(Trial 40 in Studies 1-4 and Trial 36 
in Study 5) yielded highly significant 
F ratios in all five studies. 
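The separation of an AB or RB performance record into two curves, one for trials with the buzzer and one for trials without it, can be sketched as follows. The scores used are hypothetical and serve only to show the bookkeeping; the function name is likewise an assumption.

```python
def split_by_treatment(scores: dict, treated_trials) -> tuple:
    """Split a trial -> score record into buzzer-present and buzzer-absent curves.

    Each curve is returned as a list of (trial, score) points in trial order.
    """
    treated = set(treated_trials)
    present = [(t, s) for t, s in sorted(scores.items()) if t in treated]
    absent = [(t, s) for t, s in sorted(scores.items()) if t not in treated]
    return present, absent


# Hypothetical per cent of maximum scores for the first five trials of an
# alternate-buzzer group, with the buzzer operating on Trials 3 and 5.
example_scores = {1: 20, 2: 22, 3: 40, 4: 25, 5: 43}
present, absent = split_by_treatment(example_scores, treated_trials=[3, 5])
print(present)
print(absent)
```

Plotting the two lists separately replaces the alternating peaks and troughs of the combined record with two smoother curves.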

Effects of extended posttreatment 
practice.—The inclusion of 20 addi- 
tional trials on the PSMT after ces- 
sation of knowledge of results in Study 
4 produced the interesting data for 
Trials 41-60 in Fig. 4. During these 
trials, scores for Group 4B decreased, 
indicating that the habit induced by 
the buzzer was apparently undergoing 
extinction. Within the same period, 
scores for the remaining three groups 
did not undergo systematic change.
On Trial 60, intergroup differences no 
longer produced a significant F ratio, 


although the buzzer group still scored 
highest, and all treatment groups still 
maintained an advantage over the 
control. 

Transfer—The transfer data of 
Study 5, represented by Trials 37-52 
of Fig. 5, modify the implications of 
the earlier studies considerably. It 
should be noted that the absolute 
levels of the scores for Trials 37-52
are not to be compared with the levels 
of the scores for the earlier trials, 
because little more than the relative 
difficulties of the two trainer tasks 
appears to be involved. The impor- 
tant factor to consider is intergroup 
differences during the later trials. 
It may be seen immediately that the 
buzzer group scored lowest on every 
FGRD trial, although it had been 
highest on the PSMT, as in the earlier 
studies. The remaining three groups 


differed little from each other on the 
FGRD, and the slight advantage 
Groups 5V and 5T maintained over 










Group 5C toward the end of PSMT 
training was reversed. Analyses of 
variance for Trials 37 and 52 yielded 
insignificant F ratios. In short, the 
data indicate that none of the treat- 
ments used produced benefits which 
could carry over beyond the situation 
peculiar to the PSMT, and the buzzer 
treatment tended actually to be detri- 
mental to transfer. 


DISCUSSION


The obtained data indicate rather 
strongly that use of a buzzer to present 
knowledge of results for the pedestal 
sight gunnery task is, in general, not 
to be recommended. Presumably, the 
same conclusion may be applied to the 
red filter treatment, which has produced 
effects similar to those of the buzzer in 
an earlier investigation (7). The poor 
showing of the buzzer Ss when transfer 
between two devices simulating the 
same task was attempted implies that 
any apparent initial advantages re- 
sulting from use of the buzzer were as- 
sociated with performance variance 







specific to the PSMT employed in orig- 
inal training. The Ss in the buzzer 
groups must have learned to capitalize 
upon such specific situational factors as 
scoring area settings and common PSMT 
scoring idiosyncrasies, about which other 
Ss had no information whatsoever. 
Given the opportunity to make use of 
these factors, buzzer Ss did not employ 
their training sessions to develop trans- 
ferable habits to the same extent as the 
other Ss. As a result, replacement of 
the PSMT with the FGRD put buzzer 
Ss at a disadvantage from which they did 
not recover. Had a satisfactory air- 
to-air proficiency criterion been available 
for use in place of the FGRD, similar 
results would probably have been ob- 
tained. 

It is interesting that even when prac- 
tice continued on the PSMT for a while 
after cue removal, the advantages of the 
buzzer Ss began to dissipate (cf. Study 
4). Possibly, this is due to a motiva- 
tional decline, although why such a de- 
cline should have affected the buzzer 
groups more than the verbal and tuition 
groups is not clear. It is more likely
that the decline represents actual for-
getting on the part of the buzzer Ss.
That forgetting should have occurred
for these Ss is plausible if it is true, as
suggested earlier, that they learned to
capitalize upon various specific situa-
tional factors to improve their scores.
There are enough such factors connected
with the PSMT to allow for forgetting.

Fig. 5. Performance curves for Study 5. Trials 1-36 occurred at the rate of 12 per day;
Trials 37-52 at the rate of 8 per day. (Panels: initial task, PSMT; final task, FGRD.
Curves: 100% buzzer, Group 5B; 100% verbal, Group 5V; 100% tuition, Group 5T;
control, Group 5C.)

The status of the verbal and tuition 
treatments is somewhat in doubt. The 
Ss given these treatments maintained 
a small, consistent advantage over the 
control Ss in Studies 3 and 4, and this 
advantage held up during the extended 
practice after treatment removal in 
Study 4. Unfortunately, these treat- 
ments were of little, if any, real benefit 
during the first portion of Study 5, as 
a result of which the latter portion of 
this study cannot constitute a test of the 
transferability of the treatment effects. 
It is not known why Study 5 differed 
as it did. There is enough promise in 
the area as a whole to warrant further 
investigation, however. Any treatment, 
such as the verbal and tuition treat- 
ments, which is aimed at having S learn 
a few major principles of accurate sight 
manipulation, and which provides little 
opportunity to capitalize upon specific 
situational factors, should have a good 
chance of resulting in lasting, trans- 
ferable performance gains. 


SUMMARY 


Knowledge of results was presented in pedes- 
tal sight gunnery training by means of a buzzer 
operative during trials, or spoken evaluative 
statements at the conclusions of trials. Amount 
and “pattern” of knowledge of results made little 
difference. Buzzer introduction and removal 
were associated with sharp rises and drops in 
scores, attributed to a cue-occurrence versus cue- 
nonoccurrence discrimination; no such effects 
accompanied the spoken statements. After 
cessation of knowledge of results, buzzer groups 
showed greater improvement than all others, but 
this advantage tended to diminish with practice; 
groups trained with the spoken statements per- 
formed consistently better than control groups. 
When transfer to a second training device was 
required, the buzzer Ss were at a disadvantage. 
It was concluded that the buzzer Ss had learned 
to capitalize upon situational factors particular 




to the original training device, and use of the 
buzzer was therefore not recommended. Defini- 
tive results were lacking with respect to the 
transferability of improvement resulting from 
the spoken statements. 


REFERENCES 


1. Barthol, R. P. Errors in visual size-
matching in the flexible gunnery task.
USAF Hum. Resour. Res. Cent., Res.
Note FG, 1952, No. 52-2.

2. Bilodeau, E. A. Some effects of various
degrees of supplemental information given
at two levels of practice upon the acquisi-
tion of a complex motor skill. USAF
Hum. Resour. Res. Cent., Res. Bull., 1952,
No. 52-15.

3. Ellson, D. G. The independence of track-
ing in two and three dimensions with the
B-29 pedestal sight. AAF Air Materiel
Command Engng Div. Mem. Rep., 1945,
No. TSEAA-694-2G.

4. Houston, R. C. The function of knowledge
of results in learning a complex motor skill.
Unpublished master’s thesis, Northwest-
ern Univ., 1947.

5. Johnson, A. P., & Milton, J. L. An experi-
mental comparison of the accuracy of
sighting and triggering with three types
of gun-sight hand grip controls. In P. M.
Fitts (Ed.), Psychological research on
equipment design. Washington: U. S.
Government Printing Office, 1947. (AAF
Aviat. Psychol. Program Res. Rep. No. 19.)
Pp. 241-248.

6. Melton, A. W. (Ed.) Apparatus tests.
Washington: U. S. Government Printing
Office, 1947. (AAF Aviat. Psychol. Pro-
gram Res. Rep. No. 4.)

7. Morin, R. E., & Gagné, R. M. Pedestal
sight manipulation test performance as in-
fluenced by variations in type and amount
of psychological feedback. USAF Hum.
Resour. Res. Cent., Res. Note P & MS,
1951, No. 51-7.

8. Rittenhouse, C. H., & Goldstein, M.
Target flight characteristics as determi-
nants of training transfer and task diffi-
culty in flexible gunnery. USAF Hum.
Resour. Res. Cent., in press.

9. Spieth, W. An exploratory study of oper-
ator and apparatus characteristics of a
flexible gunnery research device. USAF
Hum. Resour. Res. Cent., Tech. Rep., 1952,
No. 52-2.

10. Underwood, B. J. Experimental psychology.
New York: Appleton-Century-Crofts,
1949.


(Received January 29, 1954) 
















Journal of Experimental Psychology
Vol. 48, No. 3, 1954 


RATE RECOVERY IN A REPETITIVE MOTOR TASK AS 
A FUNCTION OF SUCCESSIVE REST PERIODS¹


EDWARD A. BILODEAU 
Skill Components Research Laboratory, Air Force Personnel and Training Research Center 


Decrements as a result of respond- 
ing are clearly apparent in so-called 
work studies (1, 2) and are also 
seriously considered in typical learning 
studies (5, 6, 7, 8). The utility of
experimental control over learning 
and work effects has been pointed out 
in two previous studies of which this 
investigation is an extension (1, 2). 
Since responding, and as a consequence 
inhibition, is common to both work 
and learning tasks, a study of work 
factors is of considerable general im- 
portance. 

Manipulation of distribution of 
practice is probably the most common 
operation in motor skills learning 
research. In such research, the most 
critical analysis often consists of 
relating recovery to one particular 
rest, i.e., a single rest interpolated 
after various durations of practice, or a 
single rest of variable duration with 
duration of prior practice held con- 
stant, etc. In no study with human 
Ss have the cumulative effects of 
successive rests been studied as such 
(3). It is the aim of the present 
experiment to study these effects 
with a work task. 

The generally accepted interpreta- 
tion of the learning type of study is 
that two processes operate during the 
practice periods: learning occurs and 
inhibition is augmented, the net effect 
typically being an increase in response 
strength because of the greater incre-
ments in the learning component.
During relatively brief rest, response
strength is further increased through
the dissipation of the decremental
component, the learning component
being relatively stable. The increase
in strength of response after rest over
that exhibited just prior to rest is
sometimes called spontaneous re-
covery or reminiscence.

¹ The experimental work for this study was
performed as part of the United States Air Force
Human Resources Research and Development
Program. The opinions or conclusions con-
tained in this report are those of the author.
They are not to be construed as reflecting the
views or indorsement of the Department of the
Air Force.

The within- and between-trial inter- 
pretation of performance sketched 
above is basically no different when a 
work task is considered, except that 
the incremental or learning component 
is assigned a constant value both with- 
in and between trials. If this assump- 
tion is valid, successive differences 
between performance measures reflect 
accumulations of the decremental 
components; if no rests have been 
interpolated, decreases in response 
strength are attributed to increases 
in accumulated decremental compo- 
nents, but if rests have been inter- 
polated, any increases in response 
strength are attributed to dissipation 
of decremental components. The re- 
covery associated with each rest, 
however, is unconfounded with the 
learning increment. In the usual 
methodology of the learning type of 
experiment, learning increments are 
confounded with dissipation of work 
decrements. 

A work task is used in the present 
study and, as such, presumably facili- 
tates the interpretation of work 
decrements. Response strength is 
measured in terms of rate of respond- 
ing. Rate, then, should vary in- 
versely with work period and directly 
with rest. Beyond this, however, are 
the purposes of determining more
precisely how rate varies as a function
of rest and of describing the course of
successive recoveries in rate of respond-
ing with various resting durations.
These determinations should allow a
clearer understanding of rate behavior
as decremental components are al-
lowed to accumulate and partially
dissipate over successive rests.

Fig. 1. Mean rate of responding is plotted against 30 successive scoring periods.
Interpolated rests are represented by the breaks in the curves, except for Group 30-0
to which no rests were administered. (Ordinate: mean no. of revolutions per 10 sec.;
abscissa: successive 10-second scoring periods.)


METHOD 


Subjects.—A total of 270 basic airmen trainees 
at Lackland Air Force Base was divided equally, 
but unsystematically, into five groups. None of 
these Ss had received prior practice with the 
apparatus. 

Apparatus.—The apparatus was a manual
crank, a device previously described (1, 2). 
Essentially, the apparatus consisted of a crank 
handle with a turning diameter of 9 in. and a 
tachometer generator for exerting a braking 
force upon the crank shaft. The crank handle 
was mounted in the horizontal plane at the edge 
of a table so that the turning motion was in the 
vertical plane. A counter recorded a score of 
one upon the completion of each revolution of 
the handle. Of the several force variations 
available, only one was used; this force has been 


referred to previously as Load 4 (1, 2). Using 
this load in turning the crank at 40, 50, 60, 70, 
and 80 revolutions per 20 sec. requires .0285, 
.0410, .0545, .0690, and .0850 hp, respectively. 
Thus, horsepower increases rapidly with in- 
creasing rate of rotation. Cranking at any of 
these rates is quite strenuous and fatiguing. 
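
The statement that horsepower rises rapidly with rate can be checked directly from the five Load 4 values quoted above. The sketch below is only an illustration: the function names are hypothetical, and the watt conversion constant (745.7 W per mechanical horsepower) is not taken from the article.

```python
# Load 4 power requirements as quoted in the text (rate in revolutions per 20 sec.).
RATES_PER_20_SEC = [40, 50, 60, 70, 80]
HORSEPOWER = [0.0285, 0.0410, 0.0545, 0.0690, 0.0850]

# Conversion constant for mechanical horsepower; an outside assumption, not from the article.
WATTS_PER_HP = 745.7


def load4_table():
    """Return one row per reported rate: (rate, hp, watts, hp per unit of rate)."""
    rows = []
    for rate, hp in zip(RATES_PER_20_SEC, HORSEPOWER):
        rows.append((rate, hp, hp * WATTS_PER_HP, hp / rate))
    return rows


def is_faster_than_proportional(rows):
    """True if hp per unit of rate rises at every step, i.e. power grows faster than rate."""
    ratios = [row[3] for row in rows]
    return all(later > earlier for earlier, later in zip(ratios, ratios[1:]))


if __name__ == "__main__":
    for rate, hp, watts, ratio in load4_table():
        print(f"{rate:>3} rev/20 sec  {hp:.4f} hp  {watts:6.1f} W  {ratio:.6f} hp per unit rate")
    print("Power grows faster than rate:", is_faster_than_proportional(load4_table()))
```

Because the ratio of horsepower to rate rises at every step, doubling the rate from 40 to 80 revolutions roughly triples the power required (.0285 to .0850 hp).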

Experimental design and procedure.—Each of 
the five groups was given a total of ten 30-sec. 
trials at cranking as fast as possible. The 
groups were differentially treated with 0, 10, 30, 
90, or 180 sec. of interpolated rest after each 
30-sec. trial. In other words, a trial was held 
constant at 30 sec., while the intertrial rest was 
varied from group to group. Group 30-10, for 
example, represents the treatment with 30 sec. 
of practice and interpolated rests of 10 sec. 

The Ss of all five groups, standing in front of 
the handle and using a standard grip, were 
instructed to rotate the handle as fast as possible 
between go and stop signals. A 3-sec. ready 
signal was given before the signal to start. 
None of the Ss was told about the duration or 
number of the forthcoming practice periods, nor 
instructed about the length or number of the 
interpolated rest periods. Each S was tested 
individually. The treatments were scheduled in 
the order 1, 2, 3, 4, 5, 5, 4, 3, 2, 1, etc. The data


recorded consisted of the number of crank revo- 
lutions per successive 10-sec. portions of the 
30-sec. practice trial (three scores per trial). 
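
The design just described can be laid out as a short sketch. It assumes, since the article does not say, that treatment numbers 1 to 5 correspond to the rests of 0, 10, 30, 90, and 180 sec. in that order; all function names are hypothetical.

```python
TRIAL_SEC = 30   # duration of each practice trial
N_TRIALS = 10    # trials per S
SCORE_SEC = 10   # length of one scoring period

# Assumed mapping of treatment number to interpolated rest (sec.); not stated in the article.
REST_SEC = {1: 0, 2: 10, 3: 30, 4: 90, 5: 180}


def treatment_order(n_subjects):
    """Assign treatments in the repeating order 1, 2, 3, 4, 5, 5, 4, 3, 2, 1."""
    cycle = [1, 2, 3, 4, 5, 5, 4, 3, 2, 1]
    return [cycle[i % len(cycle)] for i in range(n_subjects)]


def group_label(treatment):
    """Label a group as practice-rest, e.g. '30-10'."""
    return f"{TRIAL_SEC}-{REST_SEC[treatment]}"


def scoring_periods():
    """Map each scoring period to (trial number, position within trial), both 1-based."""
    per_trial = TRIAL_SEC // SCORE_SEC
    return [(p // per_trial + 1, p % per_trial + 1) for p in range(N_TRIALS * per_trial)]


def session_seconds(treatment):
    """Practice time plus the nine rests that fall between ten trials."""
    return N_TRIALS * TRIAL_SEC + (N_TRIALS - 1) * REST_SEC[treatment]


if __name__ == "__main__":
    order = treatment_order(270)
    for t in sorted(REST_SEC):
        print(group_label(t), "n =", order.count(t), "session =", session_seconds(t), "sec.")
```

Run over 270 Ss, the repeating order places 54 Ss in each group, and the three 10-sec. scores per trial give the 30 scoring periods used in the analysis.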






















RESULTS²


In Fig. 1 are plotted the mean num- 
ber of revolutions per 10 sec. against 
30 successive 10-sec. scoring periods. 
Within- and between-trial trends 
stand out clearly, for each trial is 
represented by three connected points 
and breaks in the curve represent 
successive rests (except for Group 
30-0). The data of Trial 1 were 
averaged over all five groups inasmuch 
as no differential treatment was ad- 
ministered during this trial; i.e., the 
single curve represents an N of 270 for 
Scoring Periods 1, 2, and 3. Figure 1 
shows quite clearly that the within- 
trial trend is negative for both control 
and experimental groups, or that rate 
of cranking decreased during any 
practice trial. It should be noted that
the within-trial decrement is greatest 
for the longest resting group and least 
for the shortest resting group, and 
that all other groups are fairly ap- 
propriately ranked between these two 
extremes. There is also a suggestion 
that the within-trial decrement de- 
creases slightly as a function of suc- 
cessive trials. Between-group com- 
parisons show more homogeneous 
rates of responding for terminal per- 
iods (last 10 sec. of a trial) than for 
starting periods (first 10 sec. of a 
trial). Further, after a rest or two, 
the terminal rate of the longest rest 
group remains fairly constant from 
trial to trial, but the terminal rates of 
the shorter rest groups appear to de- 
crease. Figure 1 also reveals that 
from Trial 3 (after two rests) onward, 
the level to which each rest group re-
covered decreases by an amount
similar to the decrements of the no-
rest control.

² A 3-page table giving means for each of the
groups has been deposited with the American
Documentation Institute. Order Document
No. 4302 from the ADI Auxiliary Publications
Project, Photoduplication Service, Library of
Congress, Washington 25, D. C., remitting in
advance $1.25 for 35 mm. microfilm or $1.25
for 6 by 8 in. photocopies. Make checks pay-
able to Chief, Photoduplication Service, Library
of Congress.

Fig. 2. For each of nine successive rest
periods amount of recovery is plotted as a func-
tion of rest duration, each point representing an
experimental group. (Ordinate: mean no. of turns
spontaneously recovered; abscissa: rest in seconds.)

After the first rest, each rest group 
recovered markedly, though the longer 
the rest, the greater the recovery. 
For every trial, each group is appro- 
priately ranked from that with longest 
to that with least interpolated rest. 

Trend of successive recoveries —Each 
of the nine curves of Fig. 2 depicts 
recovery as a function of the duration 
of interpolated rest, the magnitude of 
recovery being calculated as the differ- 
ence between the first postrest score 
and the last prerest score. 
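
The recovery index defined above can be sketched as follows. The function name is hypothetical and the scores in the usage example are invented for illustration; they are not data from the study.

```python
def recovery_scores(scores, periods_per_trial=3):
    """Recovery at each rest: first post-rest score minus last pre-rest score.

    `scores` holds revolutions per 10-sec. scoring period in order, with
    `periods_per_trial` scores per trial and one rest between adjacent trials.
    """
    if len(scores) % periods_per_trial != 0:
        raise ValueError("scores must contain a whole number of trials")
    n_trials = len(scores) // periods_per_trial
    recoveries = []
    for rest in range(1, n_trials):
        last_prerest = scores[rest * periods_per_trial - 1]
        first_postrest = scores[rest * periods_per_trial]
        recoveries.append(first_postrest - last_prerest)
    return recoveries


if __name__ == "__main__":
    # Invented scores for three trials (two rests), for illustration only.
    example = [25, 22, 20, 24, 21, 19, 23, 20, 19]
    print(recovery_scores(example))
```

Ten trials yield nine such recovery values per group, one for each rest plotted in Fig. 2.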

The curve for Rest 1 appears to be 
no different from the typical rest- 
recovery function; i.e., the function is 
increasing and negatively accelerated. 
The same function is descriptive of 
Rest 2, but between-group differ- 
ences are reduced. For subsequent 
rest periods, the typical function 
appears less and less appropriate, and 
by Rest 9 the trend no longer appears 
to be increasing and negatively accel- 
erated; i.e., the relative ranks of 










Groups 30-30 and 30-90 have been 
reversed. Visual extrapolation be- 
yond Rest 9 suggests a point at which 
the between-group null hypothesis 
cannot be rejected. 

A better survey of the between- 
group recovery differences over suc- 
cessive rests can be obtained from 
Fig. 3, where the mean recovery index 
discussed above is now plotted as a 
function of successive rests and rest 
duration forms the parameter. In- 
spection of this figure shows clearly 
that for successive rests there is a 
progressively increasing magnitude of 
recovery for the shorter rest groups 
(30-10 and 30-30), and a progressively 
decreasing magnitude of recovery for 
the longer rest groups (30-90 and 
30-180). A common value is sug-
gested somewhere beyond Rest 9. An
F test at Rest 9, however, gave a ratio 
of 5.74 with 3 and 212 df, necessitating 
the interpretation that differential 
recoveries are still manifest after 
nine rests. 
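
The degrees of freedom reported for this F test follow from the design if the test is taken to be a one-way comparison of the four rest groups, each with 270 / 5 = 54 Ss. A minimal sketch under that assumption, with a hypothetical function name:

```python
def one_way_df(group_sizes):
    """Return (between-groups df, within-groups df) for a one-way analysis of variance."""
    k = len(group_sizes)
    n_total = sum(group_sizes)
    return k - 1, n_total - k


if __name__ == "__main__":
    n_per_group = 270 // 5  # five equal groups, as stated under Subjects
    print(one_way_df([n_per_group] * 4))  # the four rest groups, control excluded
```

The result agrees with the 3 and 212 df given in the text.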

Fig. 3. Amount of recovery after rest is
plotted against nine successive rest periods;
duration of rest is the parameter. (Ordinate: mean
no. of turns recovered; abscissa: successive rest periods.)

In order to test the hypothesis that
the amount of recovery can be inde-
pendent of rest duration, a replication
of Groups 30-90 and 30-10 was run,
except that the number of trials was 
extended to 20 (19 rest periods), and 
the data were arranged as in Fig. 3. 
The trends of recovery for the initial 
rests were identical with those of the 
original experiment. Group 30-10 
achieved its maximum at Rest 7 and 
thereafter declined slightly but pro- 
gressively along with Group 30-90. 
Beyond seven rests, the data gave no 
reason to assume _ between-group 
differences. A trend test was made 
on the recovery scores from Rests 7 
through 19. The F’s for Groups and 
for Groups X Rests were less than 
unity, but the F for Rests was signifi- 
cant beyond the .01 level (12 and 
1272 df). It is thus concluded that 
during the last 13 rests, the amounts 
recovered were equal and decreasing. 
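
The 12 and 1272 df quoted for Rests can be reproduced under two assumptions that the article does not state: that the trend test was a groups-by-rests analysis with repeated measures on rests, and that each replication group again contained 54 Ss. The sketch below, with a hypothetical function name, shows the arithmetic.

```python
def trend_test_df(n_groups, n_per_group, n_rests):
    """Degrees of freedom for a groups x rests design with repeated measures on rests."""
    subjects_within = n_groups * (n_per_group - 1)
    return {
        "groups": n_groups - 1,
        "subjects_within_groups": subjects_within,
        "rests": n_rests - 1,
        "groups_x_rests": (n_groups - 1) * (n_rests - 1),
        "rests_x_subjects_within_groups": (n_rests - 1) * subjects_within,
    }


if __name__ == "__main__":
    n_rests = 19 - 7 + 1  # Rests 7 through 19
    # 54 Ss per group is an assumption carried over from the main experiment.
    print(trend_test_df(n_groups=2, n_per_group=54, n_rests=n_rests))
```

With 13 rests and 54 Ss in each of two groups, the Rests term has 12 df and its error term has 12 x 106 = 1272 df, which matches the reported values.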

The trial by trial changes in magni- 
tude of recovery exhibited in Fig. 3 
can be attributed in general to a de- 
creasing starting rate and a constant 
terminal rate for the longer rest groups 
but a terminal rate which decreases 
faster (during early rests) than a 
starting rate for the shorter rest 
groups. 

As previously calculated, recovery 
consisted of the score difference across 
rest, using intragroup data. A com- 
parison of the difference between rest 
groups and the no-rest control was 
made by using the rest group scores 
just after rest and comparable control 
group scores. Such an analysis com- 
pares successive differences in the be- 
havior of the rest groups with the 
control. The mean differences pre- 
sented in Fig. 4 are plotted against 
the duration of the rest, and each of 
the nine rests is represented. Each 
function has been plotted against the 
same scale and the rest values 10, 30, 
90, and 180 sec. The curves, however,
are arbitrarily spaced from one an- 
other by a constant amount. In 
other words, each curve should be
considered to be plotted to the same
scale, but each to have a different
origin on the abscissa.

Fig. 4. Mean difference between experimental and control data is plotted against duration
of interpolated rest; ordinal number of rest period is the parameter. Each curve is arbitrarily spaced
along the abscissa. (Ordinate: mean difference, control group compared; abscissa: duration of
interpolated rest in sec., with values 10, 30, 90, and 180 for Groups 30-10, 30-30, 30-90, and 30-180.)

Curve No. 1 obtained from the data 
associated with the first rest indicates, 
of course, that the difference between 
the rest groups and the control 
increases progressively with increasing 
rest, and the remaining curves, like- 
wise, suggest a similar interpretation. 
The curve for Rest 1 is markedly 
different from the others with respect 
to ordinal position. The last eight 
curves are all fairly similarly located 
along the ordinate, though there is a 
small tendency for the difference 
scores to increase in magnitude. 
After Trial 2, then, the experimental 
groups are slowly becoming more 
different from the control group. 

In summary, the data of Fig. 4 
suggest that the greatest change in 
difference scores occurs during the 


second rest period, and that thereafter 
there are but slight increases in differ- 
ence scores. The data warrant the 
hypothesis that a rather rapid pacing 
adjustment takes place within each 
group, such that after one or two 
brief rests the successive differences 
between rest and no-rest groups in- 
crease quite slowly. When means 
for each trial (30 sec.) are plotted 
against successive trials, most of the 
difference between groups has been 
established by Trial 2, corroborating 
the hypothesis of a rapid pacing 
adjustment. 


Discussion 


In the present study, the amount 
recovered for Rest 1 was an increasing 
function of the rest duration (as found 
by Kimble and Horenstein [8] and 
others for learning tasks). This finding 
was also typical of the earlier rest periods, 




and is interpretable as consistent with 
the notion that the longer the rest the 
more the dissipation of the decremental 
component. After additional rest pe- 
riods, the longer resting groups decreased 
in amounts recovered whereas shorter 
resting groups increased in amounts re- 
covered. A replication of two groups, 
but with additional rest periods, showed 
that during the additional rests the 
amount recovered was independent of 
rest duration. The bulk of the data 
suggests that with a sufficient number of 
rests it might be possible to obtain 
amounts of recovery that are inde- 
pendent of rest duration (for all rest 
durations similar to those used in the 
main experiment). If the amounts re-
covered are equal, it is likely that the
I_R's between groups are unequal as
rest begins (the rest durations are un-
equal). Further, if the decay of I_R is
an exponential function of time, then
the longest rest group has the least I_R
available for dissipation. The data sug-
gested that all groups were similar in
the progressive decrease in starting 
rate as a function of rest, but that the 
longer rest groups differed from the 
shorter rest groups in that the terminal 
rate remained fairly constant as a func- 
tion of trials. It is the foregoing trends 
in starting and terminal rates which ac- 
count for the observed differences in 
recovery as a function of successive rests 
when degree of rest is the parameter. 
In other words, the progressive drop in 
recovery for the longer rest groups is 
largely attributable to decrement in 
starting rate, and the increase in re- 
covery for the short rest groups is pri- 
marily attributable to relatively greater 
decrements in terminal rates. 
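
The argument that equal recoveries imply unequal I_R at the start of rest can be made concrete with a small sketch. It assumes that I_R decays as I_0 * exp(-k * t) during a rest of t sec. and that the amount recovered equals the amount dissipated; the decay rate k and the recovered amount used below are arbitrary illustrative values, not estimates from the data.

```python
import math

REST_DURATIONS_SEC = [10, 30, 90, 180]


def initial_inhibition(recovered, rest_sec, decay_rate):
    """I_R at rest onset needed for `recovered` units to dissipate within `rest_sec`.

    Under exponential decay the amount dissipated is I_0 * (1 - exp(-k * t)),
    so I_0 = recovered / (1 - exp(-k * t)).
    """
    fraction_dissipated = 1.0 - math.exp(-decay_rate * rest_sec)
    return recovered / fraction_dissipated


def required_levels(recovered=1.0, decay_rate=0.02):
    """Required I_R at rest onset for each rest duration, given equal recovery."""
    return {t: initial_inhibition(recovered, t, decay_rate) for t in REST_DURATIONS_SEC}


if __name__ == "__main__":
    for rest_sec, level in required_levels().items():
        print(f"rest {rest_sec:>3} sec.: I_R at rest onset = {level:.3f}")
```

For any positive decay rate the required starting level falls as rest lengthens, so under these assumptions the longest rest group must begin each rest with the least I_R.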

It may be speculated that the shorter 
rest groups are accumulating inhibition 
faster than its dissipation, and that the 
increasing amount dissipated represents 
the exponential decay process proposed 
by Hull (4). On the other hand, it may 
be that the longer rest groups accrue 
more inhibition per trial, but are able 
to dissipate nearly all the inhibition 
during the early rest periods, but a de- 
creasing amount over later rest periods 
because (a) the accumulation of dis-
sipatable inhibition lessens, or (b) the
rate of decay decreases with successive 
rests. The present data, however, do 
not indicate why the accumulation 
lessens. Of several possible reasons 
available to account for the progressively 
decreasing amount of recovery, two may 
be worth mentioning: a relatively per- 
manent fatigue residual may be accumu- 
lating as a function of successive dis-
sipations of I_R (Hull’s concept of sI_R),*
or there may be a decrease in the toler- 
ance for inhibition (7), an event much 
like a decreasing drive to do work. 
There is, of course, some communality 
between the two accounts above, but 
neither is yet sufficiently defined to 
provide the better explanation. Because
of the lack of specificity of the two con- 
cepts, it may be wisest to defer the dis- 
cussion until later. 

The replication of two groups showed 
that the amount recovered is unequal 
at early rests, becomes equal at an inter- 
mediate rest, and that thereafter the 
amount recovered declines at equal 
rates. The latter finding is consistent 
with Hull and Kimble in that both pre- 
dict a progressively decreasing capacity 
to recover from the decremental conse- 
quences of responding. The decreasing 
capacity is characteristic of both groups,
however, only after intermediate num- 
bers of rests. 

The data showed that it was possible to
predict the initial absolute response
rate after several rests by knowing the
rate of the no-rest control and the
amount of recovery after a relatively
early rest. After two rests most of the
difference between rest and no-rest
groups has occurred. In other words,
after the initial period of adjustment,
much (but not all) of the score variance
can be accounted for without recourse to
differential accumulating effects of suc-
cessive interpolated rests or changing
effects of differential rests. This con-
clusion is much the same as one reached
by Reynolds and I. McD. Bilodeau (9)
for psychomotor learning tasks and by
E. A. Bilodeau for work tasks (1, 2).

* Hull has partially anticipated these results
in his treatment of successive extinctions of the
same reaction potential. In this situation he
predicts that the amount of spontaneous re-
covery will progressively diminish when the
reaction tendency is subjected to massed evoca-
tions at uniform intervals. This prediction
follows from his views on sI_R. However, Hull
did not specify how the phenomena of successive
extinctions might interact with level of distribu-
tion of practice. Possibly, the question was
avoided because of prior failure to make clear
whether sI_R can be generated during practice,
as opposed to rest. Since its inception, however,
the notion of sI_R has never been widely accepted.

Previous studies on cranking behavior
have led to the hypothesis that under 
self-paced conditions of practice and 
work against different loads, rates of 
responding are evoked which are con- 
sistent with a common fatigue state. 
Thus in one study (1), the load used in 
prior practice accounted for little of the 
variance in present practice with a dif- 
ferent load. Further, the amount re- 
covered was independent of the loads 
investigated. In a second study with 
the crank (2), the experimental variable 
was distribution of practice and practice 
was given on each of ten days. The 
results indicated no between-group dif- 
ferences associated with the distribution 
of practice on preceding days, and further 
led to the conclusions that (a) the initial 
rate of response decreased but slightly as 
a function of day of practice, and (b)
the spaced group, when given a relatively 
brief rest, recovered to the same level 
on each day of practice. Thus, all three 
studies of manual cranking are in accord 
in showing that performance is largely 
determined by the conditions of practice 
existing at the moment. 


SUMMARY 


Five groups of Ss practiced cranking as fast 
as possible for ten 30-sec. trials. Practice trials 
were distributed by interpolating either 0, 10, 
30, 90, or 180 sec. of rest between trials. 

In general, rate of cranking decreased as a 
function of practice time or of number of previ- 
ous practice trials. At all points each rest group 
was appropriately ranked from that with longest 
interpolated rest to that with least interpolated 
rest. Examination of recovery revealed the 
typical rest recovery function; that is, the rest 
recovery associated with Rest 1 was an increasing 
and negatively accelerated function of rest 
duration. With successive rest periods, how- 
ever, the amount of recovery increased pro- 


gressively for the shorter interpolated rest 
periods, but decreased progressively for the 
longer resting periods. This trend continued 
until two of the rest groups had exchanged their 
relative ranks. An extrapolation beyond the 
last rest period suggested a point at which re- 
covery might be independent of all four rest 
durations. The independence after additional 
rests was verified for groups with 10 and 90 sec. 
of rest by replicating these groups, but using 
twice the number of rests. 

A comparison of the differences between the 
various resting conditions and the no-rest control 
group showed the greatest difference between 
groups to occur late in practice, but the most 
marked adjustment in difference between groups 
to take place early in practice. This analysis, 
together with others, suggested that during the 
initial period of practice there is a rapid adjust- 
ment in rate of cranking, and that with succes- 
sive additional periods of practice and rest there 
is but a slight further adjustment in the pacing 
effect. 


REFERENCES 


1. Bilodeau, E. A. Decrements and recovery
from decrements in a simple work task
with variation in force requirements at
different stages of practice. J. exp. Psy-
chol., 1952, 44, 96-100.

2. Bilodeau, E. A. Massing and spacing phe-
nomena as functions of prolonged and
extended practice. J. exp. Psychol., 1952,
44, 108-113.

3. Fitts, P. M. Perseveration of non-rewarded
behavior in relation to food-deprivation
and work-requirement. J. genet. Psy-
chol., 1940, 57, 165-191.

4. Hull, C. L. Principles of behavior. New
York: D. Appleton-Century, 1943.

5. Kimble, G. A. An experimental test of a
two-factor theory of inhibition. J. exp.
Psychol., 1949, 39, 15-23.

6. Kimble, G. A. Performance and reminis-
cence in motor learning as a function of
the degree of distribution of practice. J.
exp. Psychol., 1949, 39, 500-510.

7. Kimble, G. A. Evidence for the role of
motivation in determining the amount of
reminiscence in pursuit rotor learning.
J. exp. Psychol., 1950, 40, 248-253.

8. Kimble, G. A., & Horenstein, B. R. Remi-
niscence in motor learning as a function
of length of interpolated rest. J. exp.
Psychol., 1948, 38, 239-244.

9. Reynolds, B., & Bilodeau, I. McD. Acqui-
sition and retention of three psychomotor
tests as a function of distribution of
practice during acquisition. J. exp. Psy-
chol., 1952, 44, 19-26.


(Received January 19, 1954) 











Journal of Experimental Psychology
Vol. 48, No. 3, 1954





THE RELATIONSHIP OF CONVERGENCE AND ELEVATION 
CHANGES TO JUDGMENTS OF SIZE 


THOMAS G. HERMANS 
University of Washington 


Interest in the present study stems 
from a study reported in 1937 (6) on 
the relationship of convergence to the 
phenomenon of visual size constancy, 
and from measurements reported in 
1943 (7) of torsional movements that 
occur with changes in ocular move- 
ments of convergence and elevation. 
The fact, readily demonstrable, that 
perceived size is intimately associated 
with changes in convergence is cur- 
rently disregarded in the commonly 
mentioned yet unexplained phenom- 
enon of visual size constancy. The 
fact of decrease in perceived size with 
increase in convergence, coupled with 
the nice three-dimensional relation- 
ship obtained between the amount of 
torsion and the independent variables 
of convergence and elevation sug- 
gested that comparable values of 
changes in perceived size might be 
obtained with changes in convergence 
and elevation which might account 
for the moon illusion. 

The possibility that kinesthetic cues 
from ocular movements may be 
determiners of the visual perception of 
size is almost completely neglected in 
current literature. Graham (5) is 
about the only source currently to 
suggest that convergence and eleva- 
tion cues might be exploited as param- 
eters in space discriminations. It is 
true that Luneberg (8) and Gilinsky 
(4) have recognized convergence as a 
determiner of size perception, but 
they were not interested in the evalua- 
tion of this factor. Graham deplores 
the dearth of analytic data that are 
required as an adequate basis for
theorizing about size and shape, and 
concludes, “Above all, we shall be 
interested in the problem of functional 


relations between stimulus variables 
and responses rather than in further 
demonstrational experiments on un- 
specified stimulus constellations” (5, 
p. 877). 

Boring and Gibson, who might be 
regarded as the two most authoritative
sources in the field of visual percep- 
tion, neglect and deny respectively 
the efficacy of kinesthetic cues. Bor- 
ing’s work is to me illustrative of 
“demonstrational experiments on un- 
specified stimulus constellations.” 
We find in Boring (1946) an assertion 
of the constancies, that constancy 
“depends upon an integrative pro- 
perty of the brain and is not a function 
of sense organs at all” (1, p. 99). Of 
size constancy he says, “The brain 
corrects the perception that depends 
initially upon the size of the retinal 
image, corrects it in accordance with 
other sensory data that indicate the 
distance from which the retinal image 
is projected” (1, p. 99). From his 
extensive studies of the moon illusion, 
Boring (2, 1948) concludes that it is 
dependent upon the position of the 
eyes in the head (“the raised-eyes
hypothesis”), yet stops short of naming
kinesthetic cues, saying, “Now
you want to know why raising the eyes 
shrinks the moon, but no one has yet 
been clever enough to formulate for 
test the crucial hypothesis that will 
answer the question” (2, p. 16). 
Gibson disposes of accommodation 
and convergence as cues for distance 
by citing Woodworth (10) as author- 
ity for the statement, “Present evi- 
dence, however, makes it doubtful 
that they furnish any data for depth 
perception” (3, p. 111). Gibson says 


that to assume convergence as a cue 
to depth would involve assumptions of 
trigonometric processes in the brain 
too complex to be considered, and that 
it is senseless to explain depth by the 
adjustment of the eye since “Accom- 
modation and convergence are re- 
sponses of the eyes to a condition of 
their images (blur and disparity) 
which may concurrently produce that 
inner response we call depth” (3, p. 
111). Since he does not mention these 
factors in conjunction with his dis- 
cussion of size perception, we can 
assume only that these statements 
hold for size. 


METHOD


Apparatus.—The apparatus used in this study 
is the same as used and described in the previous 
study of torsion (7) with the exception of change 
in the nature of the targets for stereoscopic 
fusion. Description of it will here be confined 
to the minimum essentials. In a 4 ft. long box, 
divided vertically throughout its length by a 
partition, were placed at the remote end from O 
two lighted square apertures with their diagonals 
in the vertical and horizontal positions. These 
apertures were 80 mm. square and constant in 
size and distance from O. A calibrated
telestereoscope mounted at the other end of the box
enabled the stereoscopic fusion of the apertures. 
The outer mirrors of the telestereoscope were 
designed for synchronous rotation in opposite 
directions, thus enabling changes in convergence 
while fusing the apertures. The distance be- 
tween the apertures was such that O with an 
interpupillary distance of 63 mm. would have 
parallel axes of vision when fusing the apertures 
stereoscopically with the calibrations set at 0°, 
that is, with the surfaces of the two mirrors on 
each side parallel. This distance of 63 mm. is 
assumed to be the average interpupillary dis- 
tance for the Os used, on the basis of measure- 
ment of 104 Os used in the study of torsion. 
Changes in elevation were effected by rotation of 
the box on an axis approximating the axis of 
rotation of O’s eyes when moving in elevation. 

A variable aperture whereby Os indicated 
their judgments of size was placed at the end of 
another box also 4 ft. long. By cutting right- 
angle V’s in the sides of two sheets of shim brass, 
and by sliding the open sides of the V’s together, 
a variable aperture similar in shape to the stand- 
ard was formed. The movement of the brass 
sheets was accomplished by suspending them at 
their upper edge from steel bars driven in opposi- 
tion by rack and gear. The gear was fastened 
to one end of a steel rod extending the length of 


the box with a knob at the other end. By turn- 
ing the knob O could vary the size of the aperture 
from a 200-mm. square to practically 0. A 
millimeter scale mounted on the bar suspending 
one of the brass sheets enabled E to take readings
of the aperture size as adjusted. The intensity 
of illumination of the standard and variable 
apertures was equated and reduced by blue glass 
to minimize possible effects of afterimages. 
Both boxes were lined with black velvet. 

Procedure.—Forty-nine college students with 
no known visual defect served as Os. Each O 
was told that there was no correct judgment to 
be made, but that E wanted to know only what
size the fused apertures appeared to him at each 
successive adjustment of the apparatus. Ap- 
parent distance changes that might result from 
changes in convergence were not mentioned by 
E, but when Os volunteered comment on this 
phenomenon, they were told to disregard it and 
attend only to size. Apparent distance changes 
under these conditions constitute another prob- 
lem in visual perception and will be discussed in 
the evaluation of the present findings. 

The O sat on a stool that was adjustable in 
height, and he could change from looking at the 
standard apertures to the variable aperture with 
about a 60° turn on the stool. He was instructed 
to keep his head in that erect posture used in 
looking horizontally when the box of the stand- 
ard apertures was lowered or raised in elevation; 
that he was to adjust to change in elevation by 
moving his eyes but not his head. The E
checked this posture on each change in elevation. 
A head frame of the ordinary stereoscope was 
mounted on each box against which O rested his 
head. No other device was used to guarantee 
maintenance of the proper head position. 

Successive adjustments of the apparatus were 
effected by E through six steps in convergence 
from 0 to 10°, and five steps in elevation from 
40° below horizontal to 40° above horizontal. 
The convergence in degrees refers to the angular 
change from parallel axes of vision imposed on 
each eye. The O indicated his judgments of size 
by manually adjusting the variable aperture 
which was maintained in the horizontal position. 
Each O started with the standard in the hori- 
zontal position and with 0° of convergence. 
Judgments were made through the successively 
increasing steps in convergence and the standard 
then shifted to a different angle of elevation. 
The sequence of adjustments in elevation was 
systematically varied for the 49 Os. 
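
As a rough geometric aid to the convergence steps just described, the following sketch converts an imposed per-eye convergence angle into the distance at which the two visual axes would cross. It assumes symmetric convergence and the 63-mm interpupillary distance given under Apparatus; the function name and the simple triangle model are illustrative assumptions and are not part of the original procedure.

```python
import math

IPD_MM = 63.0  # average interpupillary distance reported under Apparatus


def crossing_distance_mm(per_eye_deg, ipd_mm=IPD_MM):
    """Distance at which the two visual axes cross (hypothetical helper).

    Assumes symmetric convergence: each eye rotates inward by per_eye_deg
    from the parallel position, so tan(angle) = (ipd / 2) / distance.
    """
    if per_eye_deg == 0:
        return math.inf  # parallel axes never cross
    return (ipd_mm / 2.0) / math.tan(math.radians(per_eye_deg))


# The six convergence steps of the procedure, in degrees imposed on each eye.
steps_deg = [0, 2, 4, 6, 8, 10]
distances = {deg: crossing_distance_mm(deg) for deg in steps_deg}

for deg, dist in distances.items():
    print(f"{deg:>2} deg per eye -> axes cross at {dist:.0f} mm")
```

Under these assumptions the largest step, 10° per eye, corresponds to axes crossing less than 20 cm from the eyes, although the apertures themselves remained at a fixed distance of 4 ft.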

The use of the telestereoscope furnished an 
essential check on O’s performance of conver- 
gence, for if he did not change convergence with 
the imposed change in the mirrors, he would see 
a double aperture. He was told to make judg- 
ments of size only while holding the fusion of the 
two, and to report whenever he saw two aper- 







tures. At each imposed position of convergence 
and elevation he was told to judge the apparent 
size and turn immediately to the variable aper- 
ture to indicate his judgment, the aperture 
having been changed by E to a size much larger 
or much smaller than the expected size judgment 
by O. Half of the judgments indicated by each 
O were made by increasing the size of the vari- 
able aperture and half by decreasing its size as 
found on turning to it. The O made only one 
judgment for each combination of the two 
independent variables. 

Adjustment of O’s eyes in elevation was 
readily assumed, but the assumption of a given 
degree of convergence was not. Thus, on turn- 
ing from indication of perceived size on the 
variable aperture back to the standard, O might 
see double, particularly at positions of the 
mirrors requiring extreme convergence. When 
this occurred, and it was always checked by E, 
the mirrors were turned toward the position for 
parallel axes until stereoscopic fusion again oc- 
curred, and then while holding the fusion, they 
were turned to the next succeeding degree of 
convergence for the next judgment. Thus judg- 
ments of size were made under conditions of a 
relatively dynamic convergence, but static con- 
ditions of elevation, even though O was told to 
judge the size only after the desired position of 
convergence had been achieved. This point will 
be considered in the evaluation of the data 
obtained. 







RESULTS AND DISCUSSION


The data derived from these ob- 
servations are summarized in Table 1. 
The first number in each cell of the 
table is the mean value of 49 judg- 
ments, and the second number is the 
standard deviation of judgments from 
the mean. The values of size read 
from the scale on the variable aper- 
ture are in millimeters. Each reading 
was half the diagonal of the aperture 
and when it was adjusted to 56.5 mm., 
it was equal in size to the standard 
apertures. The third number in each 
cell is the theoretical value in degrees 
of torsional adjustment of the eyes for 
that combination of convergence and 
elevation. The derivation of these 
values from obtained values of torsion 
was reported in 1943 (7). The rea- 
son for their inclusion in this table will 
be discussed presently. 
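
The layout of Table 1 can be checked with a short computation. The sketch below recomputes the marginal means from the cell means as transcribed from the table, and also the half-diagonal of the 80-mm standard. The variable names are illustrative, and because the printed margins were presumably derived from the raw judgments, agreement is expected only to within rounding.

```python
import math

# Cell means (mm) transcribed from Table 1. Rows are elevation in degrees;
# columns are convergence of 0, 2, 4, 6, 8, and 10 degrees.
cell_means = {
    40: [55.73, 41.51, 30.04, 21.51, 15.51, 11.04],
    20: [56.61, 43.63, 31.46, 22.93, 16.24, 12.71],
    0: [57.55, 43.81, 31.36, 22.85, 16.55, 12.57],
    -20: [56.06, 42.71, 30.95, 23.30, 17.12, 12.73],
    -40: [56.53, 42.24, 29.91, 22.24, 16.28, 11.69],
}

# Marginal means recomputed from the cell means.
row_means = {elev: sum(vals) / len(vals) for elev, vals in cell_means.items()}
col_means = [sum(col) / len(col) for col in zip(*cell_means.values())]

# Margins as printed in the table, for comparison.
printed_row_means = {40: 29.22, 20: 30.59, 0: 30.78, -20: 30.47, -40: 29.81}
printed_col_means = [56.49, 42.78, 30.74, 22.56, 16.34, 12.14]

# Half the diagonal of an 80-mm square, the scale reading that matches
# the standard apertures.
half_diagonal_mm = 80 * math.sqrt(2) / 2

for elev, mean in row_means.items():
    print(f"elevation {elev:+3d}: {mean:.2f} (printed {printed_row_means[elev]})")
print(f"half-diagonal of standard: {half_diagonal_mm:.2f} mm")
```

The recomputed margins fall within 0.01 mm of the printed ones, and the half-diagonal comes out close to the 56.5 mm cited in the text.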

An analysis of variance of the 
individual judgments of size shown in 
Table 2 demonstrates conclusively 
that convergence is a very significant 


TABLE 1

THE RELATION OF CONVERGENCE AND ELEVATION CHANGES TO JUDGMENTS OF SIZE
(N = 49 for each cell)

                              Degree of Convergence
Elevation               0°       2°       4°       6°       8°      10°     Mean

+40°    Mean          55.73    41.51    30.04    21.51    15.51    11.04    29.22
        SD            19.2     10.7      9.4      7.1      6.2      4.9
        Torsion         .20      .48      .81     1.18     1.60     2.06

+20°    Mean          56.61    43.63    31.46    22.93    16.24    12.71    30.59
        SD            20.5      9.1      8.2      7.5      6.7      6.0
        Torsion         .13      .35      .62      .93     1.29     1.70

0°      Mean          57.55    43.81    31.36    22.85    16.55    12.57    30.78
        SD            19.6      9.4      6.8      6.8      6.2      6.3
        Torsion         .07   [illeg.]   .44      .70     1.00     1.34

-20°    Mean          56.06    42.71    30.95    23.30    17.12    12.73    30.47
        SD            19.0      8.6      7.9      7.0      6.4      6.0
        Torsion         .02      .12   [illeg.]   .47      .71     1.00

-40°    Mean          56.53    42.24    29.91    22.24    16.28    11.69    29.81
        SD            18.6      8.0    [illeg.]   7.9    [illeg.]   5.7
        Torsion        -.03      .02   [illeg.]   .25    [illeg.]   .66

Mean                  56.49    42.78    30.74    22.56    16.34    12.14





























































TABLE 2

ANALYSIS OF VARIANCE OF RAW DATA

Source of Variance                        Mean Square     df       F

Elevation                                     124.45       4      3.56*†
Convergence                                 69884.51       5     24.03*†
Individual                                   1509.61      48    130.4*
Convergence × Individual                      290.80     240     25.1*
Elevation × Individual                         34.68     192      3.0*
Elevation × Convergence                         8.4       20       .7
Elevation × Convergence × Individual           11.58     960

[Footnotes to * and † are illegible in the source.]

determiner of judged size. The angle 


of elevation between the line of regard 
and the horizontal is also found to be a 
highly significant factor in the deter- 
mination of judged size. 
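
The internal consistency of Table 2 can be examined from its printed mean squares. The sketch below checks that the degrees of freedom account for all 5 × 6 × 49 judgments, and forms F ratios against the triple interaction for the four sources carrying no second mark. For the two main effects the footnote naming the error term is not legible here, so dividing them by their interaction with Individual is purely an assumption; the ratios obtained that way need not equal the printed entries.

```python
# Mean squares and degrees of freedom as printed in Table 2.
mean_square = {
    "Elevation": 124.45,
    "Convergence": 69884.51,
    "Individual": 1509.61,
    "Convergence x Individual": 290.80,
    "Elevation x Individual": 34.68,
    "Elevation x Convergence": 8.4,
    "Elevation x Convergence x Individual": 11.58,
}
df = {
    "Elevation": 4,
    "Convergence": 5,
    "Individual": 48,
    "Convergence x Individual": 240,
    "Elevation x Individual": 192,
    "Elevation x Convergence": 20,
    "Elevation x Convergence x Individual": 960,
}

# Design: 5 elevations x 6 convergence steps x 49 Os, one judgment per cell.
n_judgments = 5 * 6 * 49
total_df = sum(df.values())

# F ratios against the triple interaction (the residual term).
residual = mean_square["Elevation x Convergence x Individual"]
tested_against_residual = [
    "Individual",
    "Convergence x Individual",
    "Elevation x Individual",
    "Elevation x Convergence",
]
f_ratio = {name: mean_square[name] / residual for name in tested_against_residual}

# ASSUMPTION: each main effect is tested against its interaction with Individual.
f_ratio["Elevation"] = mean_square["Elevation"] / mean_square["Elevation x Individual"]
f_ratio["Convergence"] = mean_square["Convergence"] / mean_square["Convergence x Individual"]

print(f"total df = {total_df} for {n_judgments} judgments")
for name, value in f_ratio.items():
    print(f"{name:<26} F = {value:.2f}")
```

The four residual-based ratios round to the printed 130.4, 25.1, 3.0, and .7. Under the assumed error terms the main effects come out near 3.6 and 240, which differ from the printed 3.56 and 24.03; whether this reflects a different error term or a defect in the printed copy cannot be settled from the table alone.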

It appears that in the experimental 
findings reported here, coupled with 
the previously reported values of 
torsional movements shown to be 
intimately associated with changes in 
elevation and convergence, we have 
the key to the explanation of the moon 
illusion. It is conditioned on ocular 
movements in elevation and torsion. 
Although convergence movements do 
not occur in the moon illusion, since 
the point of fixation is infinity, the 
other movements of elevation and 
torsion, which have been concomitant 
variables with convergence in many 
other situations, yield sufficient cues 
to condition the perceptual changes in 
size occurring in the moon illusion. 

One might question whether the 
amount of change in size reported in 
this experiment is sufficient to account 
for changes in size that supposedly 
occur in the moon illusion. The 
difference in the mean (linear) values 
between horizontal and + 40°, with 
0° of convergence, is a decrease of 
only 1.82 mm. The decrease in area 
is about 6%. The decreases in area 







reported by Schur (9), to which 
Boring took no exception, were about 
30%. 

In meeting this question, one must 
consider the fact that the distance of 
4 ft. in the present experiment is much 
less than the distances involved in the 
experiments of Schur and Boring. 
If the observations were repeated with 
greater distances, different values 
might be obtained. However, a less 
speculative point to be considered is 
the fact that neither Schur’s nor 
Boring’s studies represent a normative 
approach, while the present study does 
approach such standards. To meet 
the above question, one must con- 
sider the variability of the 49 Os in 
regard to differences in size judgments 
between the two positions. The cor- 
relation of size judgments for the two 
positions was +.89, which is some- 
what indicative of the reliability of 
the Os, but, as one could expect, some 
Os indicated a decrease in apparent 
size on looking upward, while others 
indicated an increase. The range of 
differences in judgments was from a 
decrease of 22 mm. to an increase of 
16 mm., and the standard deviation of 
differences from the mean decrease of 
1.82 was found to be 7.74 mm. A 
decrease of one sigma from the mean 
decrease (1.82 + 7.74) would repre- 
sent a decrease (from the mean hori- 
zontal judgment) of 28% in area, a 
figure that is quite comparable to 
values reported by Schur. 
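
The conversion from linear readings to area in the preceding paragraph can be sketched as follows. The sketch assumes only that the aperture remains square, so that area varies with the square of the scale reading, and that the mean horizontal judgment of 57.55 mm serves as the base; the function name is illustrative, and the author's exact computation is not given in the text.

```python
def area_decrease_pct(reference_mm, judged_mm):
    """Percent decrease in area when the linear reading shrinks.

    Assumes a square aperture, so area scales with the square of the
    linear scale reading (hypothetical helper, not the author's formula).
    """
    return 100.0 * (1.0 - (judged_mm / reference_mm) ** 2)


horizontal_mm = 57.55   # Table 1: 0 deg elevation, 0 deg convergence
raised_mm = 55.73       # Table 1: +40 deg elevation, 0 deg convergence
mean_drop_mm = horizontal_mm - raised_mm
sd_of_differences_mm = 7.74

# Mean case: the average O looking 40 degrees upward.
mean_case_pct = area_decrease_pct(horizontal_mm, raised_mm)

# One-sigma case: an O whose decrease exceeds the mean decrease by one SD.
one_sigma_drop_mm = mean_drop_mm + sd_of_differences_mm
one_sigma_case_pct = area_decrease_pct(horizontal_mm, horizontal_mm - one_sigma_drop_mm)

print(f"mean decrease in area:      {mean_case_pct:.1f}%")
print(f"one-sigma decrease in area: {one_sigma_case_pct:.1f}%")
```

On these assumptions the mean case gives a little over 6%, in agreement with the text. The one-sigma case gives roughly 30%, of the same order as the 28% stated; the small difference presumably reflects a base value or rounding that the text does not specify.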

Different values might also have 
been obtained with an improvement 
in the apparatus. The point was 
made that the adjustment of Os’ eyes 
in convergence was arrived at dynam- 
ically while adjustments in elevation 
were relatively static. Synchronously 
driven servos for both elevation and 
convergence changes would enable 
observations of size to be made at the 
simultaneous termination of both 


movements, each judgment preceded 













by movement from the position of 
zero convergence and zero elevation. 
It is plausible that perceived size 
might be different immediately follow- 
ing movement than it would be some- 
time after the movement has occurred. 
The plausibility of this hunch is en- 
hanced by the observation of the very 
vivid shrinking in apparent size while 
the moving mirrors of the telestereo- 
scope are imposing increasing con- 
vergence while stereoscopically fusing 
the apertures. 

This very vivid and dominating 
experience of shrinking of apparent 
size with increasing convergence is 
involved in the other problem men- 
tioned regarding changes in apparent 
distance. In this study judgments of 
size were requested, not judgments of 
distance. I am quite confident that if 
size had not been mentioned, all Os 
would have agreed upon a decrease in 
size with increase in convergence. 
On the basis of volunteered, uninvited 
comment, some Os apparently re- 
sponded to decrease in apparent size 
by reporting an increase in apparent 
distance, while others responded to 
increase in convergence by reporting 
decrease in apparent distance. It 
remains to be determined what naive, 
uninstructed Os would report of both 
size and distance under conditions of 
actual changes only in convergence. 
On asking a few sophisticated gradu- 
ate students to report their observa- 
tions on this point, I found agreement 
on size changes but disagreement on 
changes in apparent distance with 
increase in convergence. I can per- 
ceive either decrease or increase in 
distance depending upon whether my 
attention is upon muscular action 
occurring or upon the phenomenal size 
change, respectively. 


SUMMARY AND CONCLUSIONS


Observation of the readily demonstrable phe- 
nomenon of decrease in apparent size with 


increase in convergence, the continued disregard 
of kinesthetic cues from ocular movements in 
current literature on visual size perception, and 
the nice relationship obtained between torsion 
and the other ocular movements of convergence 
and elevation stimulated the present study. 
Stereoscopically fused lighted apertures with 
objective size and distance held constant were 
judged as to apparent size by 49 Os under condi- 
tions of systematic changes in convergence and 
elevation. An analysis of variance of the data 
derived indicated that both elevation and con- 
vergence of the visual axes are significant de- 
terminers of the perception of size. 

The extreme significance of convergence in the 
visual perception of size is regarded as proof that 
theorists on visual size constancy cannot con- 
tinue to disregard, as they have, cues of con- 
vergence as determiners of the phenomenon. 

The significance of elevation coupled with the 
previously established fact of torsion as a con- 
comitant variable of other ocular movements is 
regarded as the key to the explanation of the 
moon illusion. The illusion is conditioned upon 
kinesthetic cues from ocular movements in tor- 
sion and elevation. 


REFERENCES 


1. Boring, E. G. Perception of objects.
Amer. J. Phys., 1946, 14, 99-107.

2. Boring, E. G. The nature of psychology.
In E. G. Boring, H. S. Langfeld, & H. P.
Weld (Eds.), Foundations of psychology.
New York: Wiley, 1948. Pp. 1-18.

3. Gibson, J. J. The perception of the visual
world. New York: Houghton Mifflin,
1950.

4. Gilinsky, A. S. Perceived size and distance
in visual space. Psychol. Rev., 1951, 58,
460-482.

5. Graham, C. H. Visual perception. In S.
S. Stevens (Ed.), A handbook of experimental
psychology. New York: Wiley,
1951. Pp. 868-920.

6. Hermans, T. G. Visual size constancy as a
function of convergence. J. exp. Psychol.,
1937, 21, 145-161.

7. Hermans, T. G. Torsion in persons with
no known eye defect. J. exp. Psychol.,
1943, 32, 307-324.

8. Luneberg, R. K. Mathematical analysis
of binocular vision. (For the Dartmouth
Eye Institute.) Princeton, N. J.: Princeton
Univ. Press, 1947.

9. Schur, E. Mondtäuschung und Sehgrössenkonstanz.
Psychol. Forsch., 1926, 7,
40-80.

10. Woodworth, R. S. Experimental psychology.
New York: Holt, 1938.


(Received February 2, 1954) 






















Journal of Experimental Psychology
Vol. 48, No. 3, 1954 


A NOTE ON THE AUBERT PHENOMENON 


CARL IVAR SANDSTRÖM
University of Stockholm, Sweden 


In a paper published in this journal 
and belonging to the experiments on 
sensory-tonic field theory of percep- 
tion, Wapner and Werner (6) have 
studied the kinesthetic perception of 
verticality. The present writer has 
published, independently of these 
authors, a study in which he uses a 
very similar technique (4). The 
problem is how one adjusts a pivoting 
rod to the apparent upright by using 
the hands as tactual-kinesthetic media 
in an absolutely dark room (or, as in 
Wapner and Werner’s study, blind- 
folded). The interest is concentrated 
on the task performed with head later- 
ally tilted, and on the question whe- 
ther or not the results obtained under 
such conditions are in agreement with 
those reported for visual perception of 
verticality. Wapner and Werner’s 
Ss worked with sidewise tilt of the 
head at an angle of 45° (respectively 
to the right and left), our Ss with a 
tilt of 30°. 


The results of the two different studies 
are very similar. Probably on account 
of the greater angle, Wapner and Werner 
obtained higher values of deviation. 
Both studies show the remarkable, but 
as far as we know unexplained, fact that 
the deviations from the true upright with 
head tilted to the left are less than with 
head tilted to the right. Essential for 
the following discussion is that the effect 
of the tilt of the head generally consists 
in an adjustment of the rod in the oppo- 
site direction to the tilt of the head; that 
is, a rod in an upright position is tactual- 
kinesthetically perceived as tilting in 
the same direction as the tilt of the head. 
These results are further confirmed in a 
later study by us (5) on sex differences. 
(Ninety Ss of each sex were tested. It 
may be mentioned that a sex difference 
was found with respect to the minus 
deviations with head tilted to the right. 


They were significantly greater for the 
women.) 

Does this agree with the results for 
visual perception of verticality? The 
basic study is Aubert’s (1), who de- 
clared that a true vertical luminous line 
observed in a dark room with head tilted 
sidewise appears to be displaced in the 
opposite quadrant. (This is the well-known
Aubert phenomenon or A phenomenon.)
This implies, of course, that
the luminous line, in order to be per- 
ceived as upright, must be adjusted to 
the same quadrant as the tilt of the head. 
Wapner and Werner assert generally 
and with emphasis that the significance 
of their study “is the demonstration 
that for kinesthetic perception of the 
vertical results have been obtained essen- 
tially identical in nature with those 
previously reported for visual perception 
of the vertical” (6, p. 129). This inter- 
pretation of their results is retained in 
a later paper (7, p. 293). We have in- 
terpreted the results in a quite different 
way. 

From an earlier paper by Werner, 
Wapner, and Chandler dealing with 
visual perception (8), it is clear that from 
their point of view the tactual-kinesthetic 
results are not misinterpreted. This 
paper on visual perception, however, is 
not referred to in the study under discus- 
sion (6) in spite of its great relevance. 
Thus, the authors have related their 
tactual-kinesthetic results to the so-called 
E phenomenon first discussed by G. E. 
Müller (3). He called attention to the
fact that with small tilts of the head a 
vertical luminous line often appears to 
be displaced in the same quadrant as the 
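tilt of the head. Müller says: “Ist
die Kopfneigung nicht sehr ausgiebige,
so wird häufig das Gegenteil des
A-Phänomen beobachtet” (3, p. 110,
italics mine). [“If the tilt of the head is
not very extensive, the opposite of the
A phenomenon is frequently observed.”]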

Wapner and Werner, like Witkin and 
Asch (9, p. 607), are speaking of the 
E phenomenon as having the same 
generality as the A phenomenon. In




their earlier paper on visual perception 
(8), their Table 1 (showing mean posi- 
tions of visually apparent vertical with 
a body tilt of 15° and 30°) stands out 
as very surprising to us in respect both 
to the size and to the unambiguousness 
of the deviations (8, p. 347). The re- 
sults of the tilts of the head to the right 
and left by Witkin and Asch are not 
kept apart and are therefore difficult to 
use in this respect. Witkin and Asch 
say in a note: “The E-effect is stronger 
when the body is to the right than when 
it is to the left of the upright. With 
body tilted right, 75.0 per cent of all 
adjustments were to the opposite side; 
with body left, only 45.0 per cent” 
(9, p. 608). They give no mean values 
for the different tilts of the head. So 
it happens and by no means rarely that 
the deviations from the true upright 
fall within the same quadrant, especially 
within the left one, irrespective of the 
direction of the tilt of the head. This is 
evidently what has happened also in the 
experiments by Witkin and Asch. Thus, 
they do not take into consideration the 
general predominance of the negative 
values (cf. the previous mention of the 
fact that the negative mean values are 
greater than the positive ones as regards 
the tactual-kinesthetic experiments; this 
holds good for both types of perception 
under discussion). In regard to visual 
perception the relation between the two 
values thus falling within the same 


quadrant is most often in agreement with


the A phenomenon: an S may have on
an average —5° with head tilted to the 
left and —2° with head tilted to the 
right. This of course implies a certain 
difficulty of definition, but in any case 
it is not a question of the E phenomenon. 

Mann (2) reports results for visual 
perception (obtained especially by D. B. 
Jones) exclusively valid for small tilts 
of the head which are in keeping with 
the A phenomenon. Neither are we
able to support Wapner and Werner 
with reference to the generality of the 
E phenomenon within the visual field. 
During ongoing investigations respecting 
sex differences in visual perception of 
verticality we have found cases showing 
clear E effects both at an angle of 30° 


and 45°, but most Ss adjust the luminous 
rod in agreement with the A phenomenon.
Among 60 Ss (30 of each sex) 33
Ss show A phenomenon, 16 E, 1 EA (E
at 30° and A at 45°) and 10 AE. These
frequencies confirm mostly Müller’s
opinion. Within the tactual-kinesthetic
field the A phenomenon occurs occasionally
but the E phenomenon is the common
type.


REFERENCES 


1. Aubert, H. Eine scheinbare bedeutende
Drehung von Objecten bei Neigung des
Kopfes nach rechts oder links. Virchows
Arch., 1861, 20, 381-393.

2. Mann, C. W. The effects of auditory-vestibular
nerve pathology on space perception.
J. exp. Psychol., 1951, 42, 450-456.

3. Müller, G. E. Ueber das Aubertsche
Phänomen. Z. Psychol., 1916, 49 (II),
109-246.

4. Sandström, C. I. Taktil-kinestetisk
bestämning av lodlinjen med sidolutat
huvud (Tactual-kinesthetic determination
of the vertical position of a pivoting
rod, with tilted head). Nordisk Psykologi,
1952, 4, 156-165. (English summary.)

5. Sandström, C. I. Könsskillnader vid taktil-kinestetisk
lodbestämning (Sex differences
in tactual-kinesthetic determination
of the vertical). 3. nordiska psykologmötet
i Helsingfors 1953 (3. Scand. Congr.
Psychol., Helsinki, 1953). Helsinki, 1954.

6. Wapner, S., & Werner, H. Experiments on
sensory-tonic field theory of perception:
V. Effect of body status on the kinesthetic
perception of verticality. J. exp. Psychol.,
1952, 44, 126-131.

7. Werner, H., Wapner, S., & Bruell, J. H.
Experiments on sensory-tonic field theory
of perception: VI. Effect of position of
head, eyes, and of object on position of
the apparent median plane. J. exp.
Psychol., 1953, 46, 293-299.

8. Werner, H., Wapner, S., & Chandler, K.
A. Experiments on sensory-tonic field
theory of perception: II. Effect of supported
and unsupported tilt of the body
on the visual perception of verticality.
J. exp. Psychol., 1951, 42, 346-350.

9. Witkin, H. A., & Asch, S. E. Studies in
space orientation. III. Perception of the
upright in the absence of a visual field.
J. exp. Psychol., 1948, 38, 603-614.


(Received December 15, 1953) 



















Journal of Experimental Psychology 
Vol. 48, No. 3, 1954 


SOME EFFECTS OF PROBLEM COMPLEXITY UPON 
PROBLEM SOLUTION EFFICIENCY IN 
DIFFERENT COMMUNICATION NETS¹


MARVIN E. SHAW 
The Johns Hopkins University 


In an experiment dealing with 
problem solving by small groups as a 
function of the kind of communication 
net in which they were required to 
work, the author (3) obtained results 
inconsistent with those reported by 
Leavitt (2). Leavitt had found that 
the wheel pattern, in which one S was 
placed in a central position, resulted in 
shorter solution times than did the 
circle pattern, in which all Ss were 
placed in equally central positions; 
the opposite relationship was found 
by this experimenter.²

The two experiments differed in 
several probably important respects, 
such as size of the groups, subject
population, number of trials, etc. 
From an examination of the ways in 
which Ss attempted to solve the 
problems, however, it seemed likely 
that the difference in results could be 
explained by the variable of problem 
complexity. Leavitt used relatively 
simple problems which merely re- 
quired Ss to identify a symbol held in 
common by all Ss in the group; the 
second experiment used relatively 
more complex problems which re- 
quired that Ss perform simple arith- 
metical computations such as addi- 


¹ This experiment was done under Contract
N5-ori-166, Task Order 1, between the Office of
Naval Research and The Johns Hopkins University.
This is report No. 166-I-184, Project
Designation No. NR 145-089, under that
contract.

² These two investigators used different time
measures, Leavitt using the fastest single correct 
trial and this writer using all trials whether 
correct or incorrect, fast or slow. However, 
when Leavitt’s method was used, the differences 
between the wheel and the circle were still in the 
direction opposite to that reported by Leavitt. 


tion, subtraction, multiplication, and 
division. With the simpler problems 
Ss apparently were willing to accept 
the decision of another S much more 
readily than in the case of the more 
complex problems. Furthermore, 
there was evidence that the circle
pattern was more effective than the
wheel when solving complex problems
partly because it allowed more par- 
ticipation by each group member than 
did the wheel pattern. 

The present experiment was de- 
signed to test the hypothesis that a 
communication net in which all Ss 
are in equal positions (the circle) will 
require less time to solve relatively 
complex problems but more time to 
solve relatively simple problems than 
will a communication net in which one 
S is placed in a central position (the 
wheel). 

A secondary purpose of this experi- 
ment was to obtain additional data 
regarding (a) the effects of the vari- 
ables of communication net and 
problem complexity upon the number 
of errors, number of items communi- 
cated, and group morale (or general 
satisfaction), and (b) the usefulness of 
the “independence” measure³ for predicting
number of communication
items transmitted by Ss in each posi- 
tion and general satisfaction with the 
group situation of Ss in each position. 


³ The independence index is a measure
developed by this writer (4) which takes into 
account the number of channels available to an 
S in a given position, the number of positions 
for which he must serve as a relayer of informa- 
tion, and the total number of channels in the net 
relative to the number in a totally interconnected 
net. 




METHOD 


Apparatus—The apparatus used in this 
experiment was essentially the same as that used 
in an earlier experiment (3). It consisted of a 
room which was partitioned into four cubicles 
(one of which was not used in this experiment). 
Slots were cut in the partitions so that messages 
(written on 3 X 5 cards) could be passed from 
each cubicle to every other cubicle. Curved 
metal channels prevented Ss from seeing into the 
other cubicles. Small rectangular colored areas 
around each slot indicated the cubicle into which 
that slot opened. Different communication nets 
could be arranged by closing some of the con- 
necting slots. A mercury switch mounted on the 
top of the work table in each cubicle controlled 
the timers and signal lights on E’s control panel,
and permitted each S to signal when he knew the 
answer to the problem. 

Two kinds of problems were used: (a) simple 
problems which required only the identification 
of common symbols, and (b) complex problems 
which required simple arithmetical computa- 
tions. Examples of the simple and complex 
problems are given below. 

A simple problem.—“There are several cards 
with letters printed on them similar to the ones 
which you have. These are distributed among 
the members of this group. There is only one 
letter which appears on at least one of the cards 
held by each member. What is that letter?” 

Items of information: 


(1) E A D H
(2) X K L O
(3) J M B E
(4) P U W F
(5) G R E T
(6) N Q S C


These cards were distributed in such a way 
that each S held one of the cards having the 
common symbol (in this problem the letter E). 
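
The logic of the simple problem amounts to finding the one letter common to every member's cards. The sketch below illustrates this with the six cards listed above; the particular deal of cards to the three Ss is a hypothetical one chosen for illustration, since the text states only that each S held one of the cards bearing the common symbol.

```python
# The six cards of the sample simple problem.
cards = {
    1: "EADH",
    2: "XKLO",
    3: "JMBE",
    4: "PUWF",
    5: "GRET",
    6: "NQSC",
}

# HYPOTHETICAL deal: two cards per S, one of them bearing the common letter.
hands = {"S1": [1, 2], "S2": [3, 4], "S3": [5, 6]}


def letters_held(card_numbers):
    """All letters appearing on the cards held by one S."""
    return set().union(*(set(cards[n]) for n in card_numbers))


# The solution is the letter found in every member's hand.
common = set.intersection(*(letters_held(hand) for hand in hands.values()))
print(sorted(common))
```

With this deal the intersection contains the single letter E, which no S could establish from his own cards alone.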

A complex problem.—“A small company is 
moving from one office building to another. It 
must move three kinds of equipment: (1) chairs, 
(2) desks, and (3) typewriters. How many 
trucks are needed to make the move in one 
trip?” 


[Figure 1: diagrams of the WHEEL and CIRCLE nets. Position scores as printed: WHEEL 3.0, 1.3, 1.3; CIRCLE 2.0, 2.0, 2.0.]

Fig. 1. The communication nets. The numbers are the “independence” scores for each position.







Items of information: 


(1) The company owns a total of 12 type- 
writers. 

(2) The company owns a total of 48 chairs. 

(3) The company owns a total of 12 desks. 

(4) One truck will carry 24 chairs and nothing 
else. 

(5) One truck will carry 3 desks and nothing 
else. 

(6) One truck will carry 12 typewriters and 
nothing else. 
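
The arithmetic demanded by the complex problem can be sketched as follows, in Python. The sketch assumes that, because each truck carries one kind of equipment "and nothing else," the loads for the three kinds are computed separately and then summed; the function and variable names are illustrative only.

```python
import math

# Quantities owned and per-truck capacities, from the six items of information.
OWNED = {"chairs": 48, "desks": 12, "typewriters": 12}
PER_TRUCK = {"chairs": 24, "desks": 3, "typewriters": 12}


def trucks_needed(owned, per_truck):
    """Sum the trucks required for each kind of equipment taken separately."""
    return sum(math.ceil(owned[kind] / per_truck[kind]) for kind in owned)


print(trucks_needed(OWNED, PER_TRUCK))  # 2 + 4 + 1 = 7
```

On this reading the answer is seven trucks. No S can reach it from his own two items alone, since three quantities and three capacities are needed.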


The two nets investigated are shown in Fig. 1. 
These two nets, the wheel and the circle,⁴ were
selected because the critical aspects of their 
patterns correspond to those used by earlier Es, 
although they differ in number of positions. 

Subjects.—The Ss were male undergraduates
enrolled in psychology courses at The Johns 
Hopkins University. They were naive with 
respect to the purpose of this experiment and 
with respect to this type of experiment in 
general. Twenty-four groups of three Ss each 
were run. The Ss of a given group were usually
acquainted with each other prior to the experi- 
ment, but they did not know what position was 
occupied by which other S. Each S was used 
only once. 

Procedure.—Each group was assigned at
random to one of four experimental conditions 
representing the possible combinations of the 
two communication nets and the two kinds of 
problems. With the exception of variations in 
communication net and kind of problem, the 
procedure was exactly the same for all groups. 
Each S was brought into the experimental room 
alone and seated in one of the cubicles. When 
all three Ss had been seated, each one was handed 
a mimeographed copy of the instructions. The 
E then read these instructions to the Ss while 
they followed along on the printed page. The 
instructions told the Ss the method of identifying 
group members (i.e., by colors), the ones with 
whom each could communicate, the method of 
communication, and the general nature of the 
tasks. 

Each group was required to solve four 
problems. In each case, each S was given a 
statement of the problem and two of the six 
items of information needed for solution. When 
all four problems had been completed, each S 
was asked to fill out a questionnaire which 
required that he (a) indicate on an eight-point 


4 The particular labels, “wheel” and “circle,” 
are used to prevent confusion when making 
comparisons with results from earlier experi- 
ments; it is recognized that the terms “chain” 
and “interconnected” are perhaps more ap- 
propriate labels for these nets. 



















COMMUNICATION NETS 213 


TABLE 1

Means and Results of Tukey’s Gap Test of the Trials X Nets X Complexity Interaction of Time Required for All Ss to Solve the Problem

            Simple Problems      Complex Problems
Trials      Wheel     Circle     Wheel     Circle
1           3.86†     3.68†      8.80*     7.07*
2           1.49‡     1.66‡      5.76*     4.89†
3            .95‡     1.24‡      4.96†     4.29†
4            .83‡     1.21‡      4.50†     3.57†

* Separated by a significant gap as individual means.
† Separated by a significant gap as a nonhomogeneous group.
‡ Separated by a significant gap as a homogeneous group.


rating scale how well he enjoyed his job in the 
group, how well he thought the members of the 
group had cooperated with each other, and how 
he would rate the performance of the group, and 
(b) describe the method of solution used by the 
group. 

RESULTS 


Time to solve.—Time to solve was
defined as the time required for all Ss 
to learn the answer; i.e., time meas- 
ured from the “go” signal until the 
last S threw his switch. The results 
of this measure are shown in Table 1, 
and the results of analysis of variance 
are given in Table 2. Two of the 
main variables, Trials and Com- 
plexity, were significant, as was the 
second-order interaction. Examina- 
tion of the data given in these tables 
permits the conclusion that learning 
occurred under all conditions. This 
finding is in agreement with the 
results of previous experiments. It 
also seems reasonable to conclude that 
the more complex problems required 
more time to solve than did the simple 
problems, regardless of other condi- 
tions. 


5 Analysis of the average time to solve (the 
sum of the individual times divided by three) 
gave identical statistical results. Similar results 
were obtained when the measure used was the 
fastest single correct trial. 
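
The two time measures defined above can be written out as a short Python sketch. The individual times shown are hypothetical values chosen for illustration, not data from the study, and the function name is an assumption.

```python
def time_measures(individual_times):
    """Summarize one group's trial from the individual times of its Ss."""
    return {
        # Time until the last S threw his switch.
        "time_to_solve": max(individual_times),
        # Sum of the individual times divided by the number of Ss.
        "average_time": sum(individual_times) / len(individual_times),
    }


# Hypothetical times for one three-person group on one problem.
print(time_measures([2.0, 3.0, 4.0]))
# {'time_to_solve': 4.0, 'average_time': 3.0}
```

Because time to solve is a maximum, it can never be smaller than the average time for the same trial.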


The most interesting finding for the 
original hypothesis of this study, how- 
ever, is the significant second-order 
interaction. The outcome of Tukey’s 
(5) gap test applied to the interaction 
means is given in Table 1. This test 
shows that: (a) there are three means 
(8.80, 7.07, and 5.76) which differ 
significantly from all other means and 
from each other; (b) there is one 
group of means (4.96, 4.89, 4.50, 4.29, 
3.86, 3.68, and 3.57) which differ 
significantly from all other means and 
which show over-all internal differ- 
ences when tested by the F ratio; and 
(c) there is one group of means (1.66, 
1.49, 1.24, 1.21, .95, and .83) which 
differ significantly from all other 
means, but do not differ from each 
other. Therefore, we can draw the 
following conclusions: (a) with com- 
plex problems, the circle is faster than 
the wheel on all trials, but the greatest 
difference is found on the first trial; 
and (b) with simple problems, the 
wheel is faster than the circle on three 
of the four trials, but these differences 
are not statistically significant. The 
results support the first part of our 
hypothesis and are in the direction 
predicted by the second part of the 
hypothesis. 


TABLE 2

Summary of Analysis of Variance for Time to Solve

Source of Variation             df    Mean Square    F        p
Trials                           3      54.20        69.48    <.001
Nets                             1       4.65*       —
Complexity                       1     313.49*       59.71    <.001
Groups within                   20       5.25
Trials X Nets                    3        .73
Trials X Complexity              3       1.38         1.77    —
Nets X Complexity                1       8.86*        1.69    —
Trials X Nets X Complexity       3       5.17         6.63    <.001
Residual                        60        .78
Total                           95

* Tested by groups within.
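
The F ratios in Table 2 can be checked from the mean squares. The Python sketch below assumes that the starred effects are tested against the Groups within mean square, as the table footnote states, and that all other effects are tested against the Residual mean square; the variable and function names are illustrative.

```python
# Mean squares from Table 2.
MEAN_SQUARES = {
    "Trials": 54.20,
    "Nets": 4.65,
    "Complexity": 313.49,
    "Trials x Nets": 0.73,
    "Trials x Complexity": 1.38,
    "Nets x Complexity": 8.86,
    "Trials x Nets x Complexity": 5.17,
}
MS_GROUPS_WITHIN = 5.25  # error term for the starred effects
MS_RESIDUAL = 0.78       # assumed error term for all other effects

STARRED = {"Nets", "Complexity", "Nets x Complexity"}


def f_ratio(source):
    """Divide an effect's mean square by its error mean square."""
    error = MS_GROUPS_WITHIN if source in STARRED else MS_RESIDUAL
    return MEAN_SQUARES[source] / error


for source in MEAN_SQUARES:
    print(f"{source:28s} F = {f_ratio(source):6.2f}")
```

The ratios obtained in this way agree with the tabled values to within rounding, and the two effects for which no F is tabled (Nets, and Trials X Nets) give ratios below 1.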







TABLE 3

Means and Results of Tukey’s Gap Test of Number of Communication Items Transmitted

            Simple Problems      Complex Problems
Trials      Wheel     Circle     Wheel     Circle
1            9.7       9.3       18.0      [illegible]
2            5.8       7.5       16.5      18.7
3            5.0       7.8       14.[?]    17.3
4            5.3       7.8       10.7      18.8

[The gap-test markers attached to the individual means are not legible in this copy.]

* Separated by a significant gap as individual means.
† Separated by a significant gap as a nonhomogeneous group.
‡ Separated by a significant gap as a homogeneous group.

Errors.—An error was defined as
any incorrect answer reported by any 
member of the group. Therefore, the 
maximum number of errors possible 
on any one problem was three. With 
simple problems almost no errors 
were made (Mean per problem = .06, 
SD = .08, the values being identical 
for the two nets). With complex 
problems the difference between nets 
was in the same direction as that for 
time to solve, i.e., the wheel made 
more errors (Mean per problem = .83, 
SD = .46) than did the circle (Mean 
per problem = .46, SD = .26). When 
tested by Wilcoxon’s T test (6), this
difference is significant at the .05 level 
of confidence. These results differ 
somewhat from those of earlier 
studies. Differences in size of the 
groups probably account for this 
deviation. 

Communication items transmitted.— 
Any item of information which was 
potentially useful for problem solution 
was considered as one communication 
item. The results of this analysis are 
given in Table 3. Analysis of vari- 
ance showed that the Trials term and 
the Complexity term were both sig- 
nificant (p<.001) as was the Trials 
X Nets X Complexity interaction 
(p<.01). The significant main vari- 
ables indicate that learning occurred 


and that more communication items 
were transmitted during the solution 
of complex problems than with simple 
problems. To facilitate interpreta- 
tion of the significant second-order 
interaction, Tukey’s gap test was 
applied to the interaction means. 
The results of this test are indicated in 
Table 3. These results permit the 
conclusion that on the last trial, at 
least, the circle required that more 
items be transmitted in order to reach 
a solution than did the wheel, and 
that this difference was greater with 
complex than with simple problems. 
Group morale.—The average of the 
ratings of job satisfaction, group co- 
operation, and group performance was 
taken as the measure of group morale. 
The only significant effect was that 
produced by the complexity variable
(p<.001). The morale of the groups
solving simple problems was higher


(Mean rating = 7.0, SD = .54) than 
the morale of the groups solving com- 
plex problems (Mean rating = 6.0, 
SD = .88). Net differences were in 
the same direction as that found in 
previous studies but were not great 
enough to yield significant F values. 
The same results were obtained when 
the various ratings were considered 
individually rather than together. 


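TABLE 4

Mean Number of Items Transmitted and Mean Ratings of Individual Morale According to Degree of Independence

[Table body not recoverable from this copy. Surviving layout: columns for degree of independence under simple and complex problems; rows for number of items communicated and for morale. Surviving entries: 5.7 (number of items communicated) and 6.6 (morale).]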











TABLE 5

Summary of Analyses of Variance of Number of Items Transmitted and Morale According to Positions

                                      Items Communicated        Morale
Source of Variation            df     Mean Square   F           Mean Square   F
Independence                    2      822.00       16.88***    15.45         5.83**
Complexity                      1     3081.13       63.27***    11.68         4.41*
Independence X Complexity       2      107.38        2.20         .01
Subjects within                66       48.70                    2.65
Total                          71

* Significant at .05 level of confidence.
** Significant at .01 level of confidence.
*** Significant at .001 level of confidence.


Relationship of independence to num- 
ber of items communicated and to 
individual morale.—In Table 4 are
shown the independence scores for each 
of the three possible positions, the 
mean number of items per problem 
transmitted from each position, and 
the mean rating of morale (or general 
satisfaction) for each position. Since 
in the circle all positions have an 
independence score of 2.0, all members 
of the circle net are in one category. 
Similarly, the two positions of the 
wheel having an independence score 
of 1.3 are combined for this analysis. 
The relationship between independ- 
ence and the two dependent variables 
(items communicated and morale) is 
positive and approximately linear. 
The results of analyses of variance 
(Table 5) show that the differences in 
independence produce significant dif- 
ferences in both items transmitted 
and morale. (The significant Com- 
plexity term is simply a reflection of 
that effect found in earlier analyses.) 

Differences in positions having the 
same independence index (e.g., the 
different positions of the circle) were 
not significantly different, either with 
respect to number of items com- 
municated or with respect to indi- 
vidual morale. Thus, these results 





may be taken as showing a real 
difference between positions which 
differ with respect to independence. 
While these differences can be ex- 
plained on the basis of other differ- 
ences (e.g., individual centrality), 
taken in conjunction with other re- 
sults which have already been pub- 
lished (4) they offer further evidence 
as to the usefulness of the independ- 
ence measure in the prediction of 
amount of communication and morale 
for individuals within groups. 


Discussion 


The results of this experiment gener- 
ally support the original hypothesis that 
a communication pattern which places 
one person in a central position (the 
wheel) will require more time to solve 
relatively complex problems but less time 
to solve relatively simple problems than 
will a communication net which places 
all persons in positions which are equally 
central (the circle). The major deviation
from this hypothesis is that the
times to solve simple
problems did not differ significantly for
the two nets,⁶ although the wheel did
require less time than the circle on three 


6 Applying the t test, we find that the difference
between the means on the last trial approached
significance: Obtained t = 2.11; t.05
= 2.228.




of the four trials. Because of the large 
individual differences, this failure may be 
due, in part, to the small number of 
groups in each condition. At any rate, 
we can now draw some fairly general 
conclusions about the effect of the com- 
munication net upon the problem-solving 
performance of small groups. With 
simple problems, the wheel is faster than 
(2), or just as fast as (this study) the 
circle. This conclusion can be general- 
ized to the extent that it holds for both 
three- and five-position nets. With 
complex problems, the circle is faster 
than the wheel. This statement can be 
generalized to the extent that it holds for 
both three- and four-position nets (this 
study, and 3). Thus, we have some 
evidence that size of the group does not 
change these conclusions to any marked 
extent. 

Perhaps we can go a bit further in our 
generalizations. Evidence gained from 
earlier studies involving other communi- 
cation nets (namely, the chain and Y 
patterns in [2] and the slash pattern in 
[3]) suggests that the differences among 
communication patterns are due to the 
availability of information and to the 
possibility of contributions from all
members of the group. When simple 
problems are to be solved the availability 
of information is of primary importance. 
All Ss are equally capable of identifying 
common symbols and communication 
demands are less than the maximum 
permitted by the system under the im- 
posed conditions. Thus, with simple 
problems the wheel should be faster than 
the circle because information is just as 
available to the person in the central 
position in the wheel as it is to any one 
position in the circle, and because the 
wheel pattern has the added effect of 
designating which S will perform the 
function of identifying the common sym- 
bol. As the complexity of the problem 
increases, however, the possibility of 
contributions from all members of the 
group becomes much more important. 
This is true because some Ss are more 
capable than others of solving such 
problems quickly, and because part 
solutions can be delegated to various 




positions, thereby compensating in part 
for the effects of “saturation.”⁷ With
complex problems, then, the wheel should 
be slower than the circle because the 
central person becomes saturated (i.e., 
because he must do most of the work, 
either the actual solution or relaying 
information, the optimal output level is 
exceeded), and because it sometimes 
forces the weakest person in the group to 
function in the leadership role. 

It was originally thought that one of 
the reasons for the differences in time to 
solve would be the method used in 
solving the problems. It was expected 
that with the simple problems, the 
method of solution in the wheel would be 
to send all information to the central 
position and let him send out the answer, 
while with the complex problems the 
method would be to send all information 
to all Ss and let each S work out his own
solution; in the circle it was expected 
that the latter method would be used 
with both types of problems. Actually, 
from an analysis of messages and from 
the Ss’ own reports, the wheel used the 
first method exactly 50% of the time 
regardless of the type of problem; the 
circle behaved as expected. Therefore, 
differences due to the kind of problem 
cannot be attributed to the method of 
solution. 

The finding that the circle requires 
more messages for solution than does the 
wheel is in agreement with the results of
other investigations (2, 3). The previ- 
ous explanation in terms of independence 
of action (2) appears adequate to account 
for this finding. 

Failure to find statistically significant 
differences in group morale as a function 
of the communication net does not agree 
with the results of earlier studies. Total 
difference between the three-position 
wheel and the three-position circle with 
respect to independence is slight and 


7 The term “saturation” has been used by
Gilchrist, Shaw, and Walker (1) to refer to the 
condition which exists when the required number 
of message units passes a certain optimal output 
level, thus producing effects which are counter 
to the effects of a favorable position in the net. 
















hence would be expected to exert only 
slight influence upon total morale. The 
significant differences in morale as a 
function of task complexity may be 
related to feelings of achievement; i.e., 
the shorter times required for the simple 
problems may have led Ss to believe that 
they had attained a high achievement 
level. This view is supported by the 
fact that the greatest difference produced 
by the task-complexity variable was 
found on the ratings of group perform- 
ance; Ss solving simple problems rated 
the group’s performance much higher 
(Mean = 6.9) than did Ss solving com- 
plex problems (Mean = 5.6). 

In general, the results of this study 
suggest that future experimentation will 
perhaps be more fruitful if more complex 
problems are used, since most problems 
which must be solved by small groups in 
real-life situations are more complex than 
those that have been investigated in the 
laboratory. At the present time the 
evidence seems to show that: (a) a net 
which allows maximum participation by 
all group members is more efficient (in 
terms of time to solve and number of 
errors) than is a net which restricts this 
interaction, and (b) there is more communication
activity and greater satisfaction
in a net which allows equal
participation than in one which does not 
allow equal participation by all group 
members. These statements can be 
made with greater confidence when the 
problems to be solved are relatively 
complex than when the problems are 
relatively simple. 


SUMMARY 


Twenty-four groups of three Ss each were run 
to test the hypothesis that a communication net 
in which Ss are placed in equal positions will 
require less time to solve relatively complex 
problems but more time to solve relatively simple 


problems than will a net in which one S is placed 
in a central position. Six groups were assigned 
to each of four conditions: (a) the wheel— 
simple problems, (b) the wheel—complex 
problems, (c) the circle—simple problems, and 
(d) the circle—complex problems. 

The outcome of this experiment generally 
supports the hypothesis. However, the differ- 
ences in times required to solve the simple 
problems by the Ss in the two nets, although in 
the expected direction, failed to reach statistical 
significance. 

A secondary purpose was to collect additional 
data regarding (a) the effects of the variables of 
communication net and problem complexity 
upon number of errors, number of items communicated,
and group morale, and (b) the usefulness
of the independence measure for predicting
number of items communicated by Ss
in each position and general satisfaction with the 
group situation of Ss in each position. These 
data were largely in agreement with the results 
of other investigations. 


REFERENCES 


1. Gilchrist, J. C., Shaw, M. E., & Walker, L. C. Some effects of unequal distribution of information in a wheel group structure. J. abnorm. soc. Psychol., in press.

2. Leavitt, H. J. Some effects of certain communication patterns on group performance. J. abnorm. soc. Psychol., 1951, 46, 38-50.

3. Shaw, M. E. Some effects of unequal distribution of information upon group performance in various communication nets. J. abnorm. soc. Psychol., in press.

4. Shaw, M. E. Group structure and the behavior of individuals in small groups. J. Psychol., 1954, 38, 139-149.

5. Tukey, J. W. Comparing individual means in the analysis of variance. Biometrics, 1949, 5, 99-114.

6. Wilcoxon, F. Some rapid approximate statistical procedures. Stamford, Conn.: American Cyanamid Co., 1949.


(Received for early publication 
April 23, 1954) 





Journal of Experimental Psychology
Vol. 48, No. 3, 1954 


CONTEXT EFFECTS AND THE VALIDITY 
OF LOUDNESS SCALES¹


W. R. GARNER 
The Johns Hopkins University 


Recent research on loudness-scaling 
methodology (3, 4) has suggested that 
the average O is not able properly to 
describe magnitude relations between 
tones of different loudnesses if he must 
describe the relations on a numerical 
scale, particularly one with ratio 
properties. The difficulty probably 
lies in O’s inability to align his con- 
ceptual scale of numbers (whose prop- 
erties he may understand perfectly) 
with his subjective scale of loudness, 
according to the rules of alignment 
which E sets for him. It is quite 
possible that O has a perfectly good 
magnitude scale of loudness in his 
sensory system. Experiments have 
indicated, however, that he has diffi- 
culty describing the numerical prop- 
erties of this sensory scale by using a 
conceptual scale of numbers. 

Most psychological scales of sensory 
magnitude have been constructed 
from data for which O was required to 
produce a stated numerical relation, 
to state what numerical relation held 
between two magnitudes, or to state 
that a pair of stimuli did or did not 
satisfy the requirements of a given 
numerical ratio. Since with all of 
these so-called direct scaling methods, 
O is required to use or at least to 
understand the relation between his 
conceptual scale of number and his 
sensed magnitudes, there is a serious 
question about the validity of psycho- 


1 This research was done under Contract
N5-ori-166, Task Order 1, between the Office of 
Naval Research and The Johns Hopkins Uni- 
versity. This is Report No. 166-I-187, Project 
Designation No. NR 145-089, under that con- 
tract. The author is indebted to Miss Frances 
Wolfram for assistance in the collection and 
tabulation of the data. 


logical scales constructed from such 
data. Psychological scales con- 
structed with these direct scaling tech- 
niques have a kind of face validity 
which is desirable, since the scales are 
constructed to satisfy the require- 
ments of the numerical statements 
made by Os. As will be seen, how- 
ever, this face validity may be illusory. 

One way to determine the meaning- 
fulness of such scaling techniques is 
to determine the extent to which the 
judgments of O can be influenced by 
the context of stimuli presented to 
him. Many different experiments 
(e.g., 1, 2, 5, 6, 7) have shown that the 
point of subjective equality, or neutral 
point for a series of stimuli, tends to be 
the mid-point of the range of stimuli 
presented to O. Helson (5, 6), with 
his concept of adaptation level, has 
most strongly emphasized this context 
effect. This effect will operate most 
severely when O is presented with a 
judgmental situation which is am- 
biguous, for whatever reason. If O 
has a very definite idea about the 
numerical relations between different 
stimulus magnitudes, then his judg- 
ments should be relatively unaffected 
by the range of stimuli presented to 
him. If, on the other hand, he has 
only a very ambiguous idea about 
these numerical relations, it should be 
possible to get him to accept widely 
different ranges of stimuli as satisfy- 
ing the stated numerical requirements. 

Purpose of experiment.—The specific 
purpose of this experiment, then, was 
to determine to what extent half- 
loudness judgments made with a 
method of constant stimuli can be in- 
fluenced by the context of stimuli 










LOUDNESS SCALES 


presented to O. A secondary purpose
was to attempt to gain some under- 
standing of the process involved when 
O decides that a particular stimulus 
satisfies a given numerical require- 
ment with respect to some other 
stimulus. 


METHOD 


Observers.—Thirty Os were used in the experi- 
ment. They ranged in age from 16 to 25 yr., 
and were all either high school or college students.
Nine were female. None had had any
previous experience in auditory experiments or 
in any experiment involving psychophysical 
judgments. 

Procedure.—One O at a time was seated in a 
soundproof room and given a headset wired for 
monaural listening. The general procedure is 
best described by the written instructions to O:

“The purpose of this experiment is simply 
to find out how different two tones must be in 
loudness for one of them to sound half as loud 
as the other. We are not concerned with a 
physical difference between tones, but only with 
what sounds half as loud to you. Thus there is 
no problem of being right or wrong, but only of 
finding out how these tones sound to you. 

“You will listen to a series of pairs of tones, 
one pair every seven seconds. The first tone 
will always have the same loudness. The second 
tone will vary in loudness from one time to 
another. Your task is to decide whether the 
second tone is more or less than half as loud as 
the first. If it sounds more than half as loud 
as the first, you put a plus (+) on your record 
sheet. If it sounds less than half as loud, you 
put a zero (0) on your record sheet. 

“First you will be given a practice series, so 
that you will be familiar with all the different 
loudnesses you will hear. When you think you 
are ready, we will begin the actual experiment.” 

All tones had a frequency of 1000 cps, and 
both the standard and the variable tones had 
durations of 1 sec., with a silent interval of 1 sec. 
between the two tones of a pair. The tones were 
filtered through a narrow band-pass filter to 
eliminate transients. All timing was done 
electronically with equipment in a room adjacent 
to that of O, who was alone during the ex- 
periment. 

For each O there were six different intensities 
of the variable tone, in 2-db steps, covering a 
range of 10 db. These intensities were presented
in random order for a total of 600 presentations, 
or 100 per intensity. The random series were 
restricted only by the requirements that every 
intensity follow every other one and itself an 







equal number of times, and that all intensities 
be presented equally often for the first and last 
half of the total series. 

The practice series also presented the in- 
tensities in random order, and averaged a total 
of about 30 presentations, with a maximum of 
40. The total time required per O was approximately
2 hr., and O was given a 10-min.
rest in the middle of the experiment. 

Experimental conditions.—The Os were assigned
randomly to three different groups, with
ten Os per group. All Os had the same intensity 
for the standard tone of 90 db SPL (db re 0.0002 
microbar). The three groups differed only in 
the range of variable intensities presented. For 
one group, the variable intensities ranged from 
55 to 65 db; for another, from 65 to 75 db; and 
for the third, from 75 to 85 db. The highest 
range was selected to include the value expected 
from the present sone scale of loudness (9). 


RESULTS 


Refusals.—The instructions to the
Os did not specifically ask them to 
reject the entire range of presented 
stimuli if it did not include an in- 
tensity which had a loudness half that 
of the standard. On the other hand, 
if Os had a very clear idea of what 
constituted half loudness, many of 
them should have rejected the pre- 
sented range. Only one of the 30 Os 
stated that the presented range did 
not include a half-loudness value. 
That one O had been assigned to the 
65-75 group, and said that all tones 
were more than half as loud. That O
was then reassigned to the 55-65 
group, and run as though she had not 
previously been run at all. 

Group results.—For each O, an
intensity value for half loudness was 
computed by linear interpolation on 
the psychophysical function to deter- 
mine an intensity equivalent to half 
of the judgments being more than 
half as loud. Such a value was com- 
puted for the first half of the series, 
for the second half, and for both 
halves together. The means, me- 
dians, and SD’s of these individual 
half-loudness values are presented in 





220 W. R. GARNER 


TABLE 1

Mean and Median Intensities of Half Loudness for Each Group of Os, for Each Half of the Judgments and for All Judgments

Range of             First Half             Second Half            All Judgments
Intensities (db)     Mean   Median  SD      Mean   Median  SD      Mean   Median  SD
55-65                60.8   60.3    1.8     60.0   59.6    2.1     60.4   59.8    1.8
65-75                70.1   70.0    1.4     69.7   69.7    2.0     69.9   69.8    1.6
75-85                80.2   80.3    3.0     80.1   79.5    2.9     80.2   79.9    2.7
Table 1. The data show that for
each group, the means and medians
are very close to the mid-point of the
range of presented variable stimuli.
Not one of the nine means shown is
significantly different from the mid-point
of the presented range.

Since no mean differed significantly
from the mid-point, an analysis of
variance was done with the data, with
the single cell entry being the individual
O’s half-loudness value for
each half of the judgments, stated
relative to the mid-point of the range
of intensities. This procedure gave
approximately the same mean to each
group. This analysis showed no significant
differences between groups,
between first and second halves, or between
combinations of these. The
fact that each group mean tended to
be lower on the second half is therefore
not statistically reliable.

Fig. 1. Cumulative frequencies of intensities
judged to be half as loud as a 90 db tone. The
three different ranges of variable stimulus intensities
(55-65, 65-75, and 75-85 db) are indicated.
Each O’s half-loudness
value is based on the total of 600 judgments.
[Axes: intensity (SPL in db) on the abscissa; number of Os on the ordinate.]
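
The computation of an individual half-loudness value described under Group results can be sketched in Python as follows. This is a minimal illustration of linear interpolation to the 50% point, not a reconstruction of the author's worksheet; the function name is an assumption, and the proportions are hypothetical values for a single O.

```python
def half_loudness_intensity(intensities, prop_more):
    """Intensity at which half of the judgments are 'more than half as loud'.

    intensities: ascending db values of the variable tone.
    prop_more:   proportion of plus judgments at each intensity.
    Interpolates linearly between the first adjacent pair that brackets 0.5.
    """
    points = list(zip(intensities, prop_more))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 <= 0.5 <= p1 and p0 != p1:
            return x0 + (0.5 - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("0.5 is not bracketed; extrapolation would be needed")


# Hypothetical proportions for one O in the 65-75 db group (not study data).
intensities = [65, 67, 69, 71, 73, 75]
prop_more = [0.05, 0.20, 0.40, 0.65, 0.85, 0.95]
print(round(half_loudness_intensity(intensities, prop_more), 1))  # 69.8
```

The sketch raises an error when no pair of adjacent points brackets the 50% level; for such an O the value would have to be obtained by extrapolation rather than interpolation.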

Each of the three groups thus shows 
a complete dependence of the half- 
loudness value on the context of 
variable stimuli presented, and no one 
of the groups shows this effect more
than another. We have, then, no 
evidence that there is any validity at 
all to this type of judgment. 

Individual differences.—Even 
though the group picture shows an 
essentially complete dependence of 
the half-loudness judgments on the 
context of presented stimuli, indi- 
vidual differences within each group 
are relatively large and statistically 
significant as determined by the 
analysis of variance. The extent of 
these individual differences can be 
seen in Fig. 1, where cumulative fre- 
quency distributions of the mean half- 
loudness values are shown for each 
group. The individual differences for 
the 75-85 range are greater than for 
the other two ranges, but this differ- 
ence is of doubtful statistical signifi- 
cance. These individual differences 
in the 75-85 group are surprisingly 
large in view of the fact that the mean 
for the entire group is not significantly 
different from the mid-point of the 
range. For example, both the highest 
and lowest values shown are actually 
outside of the presented range, and it 
was necessary to extrapolate rather 
than interpolate to determine these 
values. 











Fig. 2. Intensities judged to be half as loud
as a 90 db tone for the first and second halves
of the experimental series. The intensity values
are in db relative to the mid-point of the range
of variable stimuli, as indicated. Each plotted
point represents a single O.
[Axes: intensity, first half, on the abscissa; intensity, second half, on the ordinate. Separate symbols mark the 75-85, 65-75, and 55-65 db ranges.]


The consistency of these differences 
is shown in Fig. 2, where the half- 
loudness intensity values for the first 
half of the judgments are plotted 
against those for the second half. 
Data for all three groups have been 
put on a single graph by plotting in- 
tensities relative to the mid-point of 
the range, as was done for the analysis 
of variance. The correlation coeffi- 
cient (which is actually a split-half 
reliability coefficient) between the two 
halves is .84. Thus these differences 
between Os are quite consistent during 
the two halves of the experimental 
series. 
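The split-half computation just described can be sketched as follows. Each O's half-loudness value is expressed relative to the mid-point of his own range, so that the three groups can be pooled, and the product-moment correlation is then computed between the first-half and second-half values. The function names and the sample values are hypothetical and are not the experimental data.

```python
from math import sqrt


def relative_to_midpoint(value_db, range_low_db, range_high_db):
    """Express a half-loudness value in db relative to the range mid-point."""
    return value_db - (range_low_db + range_high_db) / 2.0


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical rows: (range_low, range_high, first_half_db, second_half_db),
    # one row per O. These numbers are made up for illustration only.
    observers = [
        (75, 85, 82.0, 81.0),
        (75, 85, 77.5, 78.5),
        (65, 75, 71.0, 72.0),
        (65, 75, 68.0, 67.0),
        (55, 65, 58.0, 59.5),
        (55, 65, 62.5, 61.0),
    ]
    first = [relative_to_midpoint(f, lo, hi) for lo, hi, f, _ in observers]
    second = [relative_to_midpoint(s, lo, hi) for lo, hi, _, s in observers]
    print(round(pearson_r(first, second), 2))
```

Subtracting the mid-point before pooling keeps the differences between the three ranges from inflating the coefficient, so that it reflects only the consistency of differences between Os.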

In order to determine how soon in 
the experimental series O established 
the intensity which was half as loud 
for him, Fig. 3 was plotted. In this 
figure, the total number of higher 
judgments for the first 100 stimuli is 
plotted against the total number of 
higher judgments for the last 100 
stimuli. Since there were 400 judg- 
ments intervening between the first 
and last 100, this figure indicates the 
extent to which Os maintained the 




same half-loudness value over a longer 
period of time. (Numbers of higher 
judgments were used, rather than 
actual half-loudness values, because of 
the relatively few judgments at each 
intensity level of the variable stim- 
ulus.) 

The correlation coefficient is, of 
course, lower than the split-half
coefficient, but still rather high (.77). 
Thus it appears that Os established 
early in the series what they con- 
sidered to be half loudness, and then 
maintained this level over the entire 
series with considerable consistency. 
Since Os were given practice trials 
before the actual experimental trials, 
it seems very likely that they
established their level during this
practice series.

Discussion 


In effect these results indicate that Os 
in these experiments accepted the sug- 
gestion of the context of variable stimuli 
provided by E as to what constitutes 
half loudness almost completely. Cer- 
tainly the results presented in Table 1 
give no indication that the Os used any 
basis for their judgments other than the 









































[Figure 3: scatter plot. Abscissa: higher judgments, first 100; ordinate: higher judgments, last 100 (both axes 0 to 100). Legend: range (db) 75-85, 65-75, 55-65; r = .77.]

Fig. 3. Number of higher (i.e., more than
half as loud) judgments for the first and the last
100 presented variable stimuli. Each plotted
point represents a single O.





222 W. R. GARNER 


stimulus context. With results like 
these, one might still attempt to determine
whether one group had a range of
stimuli closer to the “true” half-loudness
value than other groups. For example, 
there might be less interobserver vari- 
ability for those Os whose presented 
range of stimuli was closer to the “true” 
value, since they would not have as much 
conflict between their own ideas and the 
suggestion of the context. But there 
were no significant differences between 
variabilities. Half-loudness values might 
have been different early in the series, 
before the context effect had time to 
operate fully. But there were no differ- 
ences between the groups in this respect. 
The reliabilities of the judgments might 
have been greater for the group whose 
range of stimuli was near the “true” val- 
ues, but again there were no differences 
between the groups in this respect. Thus
by any criterion which seems reasonable, 
it is not possible to surmise that the 
“true” value for half loudness is any 
closer to one of these three ranges than 
to another. 

Despite the very strong evidence from 
this type of reasoning that Os have little 
or no idea what they are doing in this 
type of judgment, and thus respond only 
to the context of stimuli presented, there 
is the paradox that within each group 
the Os differ considerably and that these 
differences are very significant and con- 
sistent from early to late parts of the 
series of judgments. Thus even though 
we have evidence that the Os do not 
know what they are doing (with respect 
to the instructions), they do in fact do 
something quite consistently, and judg- 
ing from the reliabilities, feel quite sure 
that they know what they are doing. 

One or two illustrations will indicate 
the extent of this consistency (and, in a 
sense, resistance to the context effect). 
One O who was presented with the high- 
est range of intensities stated that all but 
six of his first 100 stimuli were not loud 
enough, and in his last 100 judgments, 
stated that all but four were not loud 
enough. The willingness to use so few 
higher judgments certainly indicates a 
strong resistance to the context effect. 


But within this same group, another O 
stated that all but 11 of his first 100 
stimuli were too loud, and that all but 25 
of his last 100 were too loud. At the 
other extreme, one O presented with the 
lowest range of intensities stated that 78 
of his last 100 stimuli were too loud. In 
terms of intensities required for half 
loudness, we have one O with a half- 
loudness value for the entire series of 
85.5 db, and another O with a half- 
loudness value of 57.8 db, and both of 
these Os were consistent in this judgment 
throughout the entire series. 

It seems very likely that Os are quite 
unsure of what constitutes half loudness, 
but when asked to decide what is half 
loudness, pick some value within the 
range of choices provided by E. These 
data suggest that this value is selected 
early in the experiment, and the choice 
may be very heavily influenced by the 
stimuli first heard in the training series. 
Unfortunately, such an extreme result 
was not anticipated, and account was not 
kept of the exact order of presentation in 
the training series. On whatever basis 
the choice is made, however, O clearly 
can maintain consistency of judgment 
for a considerable number of judgments. 
Probably, the real problem to O is simply 
one of establishing an identifying rela- 
tion, and he can learn to identify a 
particular stimulus with a particular 
name. The fact that E uses an identify- 
ing set of numbers which imply certain 
numerical relations, however, does not 
allow him to assume that O makes his 
identification with these relations in 
mind. 

In brief, these experiments have shown 
that Os can exhibit a very high reliability 
for a type of judgment for which there is 
no validity. When an O says that A is 
half of B, we can believe that he will 
continue to say the same thing in this and 
in similar situations. But we cannot 
believe that A is in fact half of B. 

It is interesting to note parenthetically 
that if any one of the three groups used 
here had been used as the only group in a 
fractionation experiment, we would have 
had good reason to assume validity of the 
judgments by many of the criteria often 



















used. Reliability is good, and it is often 
used as an argument for validity. In one 
recent experiment (8), for example, the 
method of constants was intentionally 
selected for the experiment because the 
evidence indicated that it gave the high- 
est reliability. The large interobserver 
differences themselves suggest validity, 
since it is hard to believe that Os can be 
so consistent in disagreeing with each 
other if they do not know what they are 
doing.

The nature of validity.—In many ways,
the validity problem in psychological 
scaling comes from a misunderstanding 
of the nature of validity, and some of 
this misunderstanding derives from 
operationism in its more extreme form. 
If we state, as has often been stated, that 
a concept is nothing more than the 
operations from which it has been de- 
rived, then, of course, a loudness scale 
can be constructed from data such as 
those presented in this experiment, and 
the scale is valid for predicting group
responses for this particular experiment 
—and probably for no others. This is 
equivalent to saying that a test is valid 
for predicting scores on the test, and 
when we say this there is no longer any 
meaning to the term “validity.” 

Science progresses by means of gen- 
eralizations, because without generaliza- 
tions we have as many concepts as there 
are facts or data. A concept should 
serve the function of summarizing or 
bringing together at least two, and pre- 
ferably many, discrete sets of facts or 
relations, and unless the concept does 
this, there is little point to it. Thus to 
argue that a psychological scale (a 
concept) is valid because it is based on 
numerical judgments of Os makes a farce 
out of the concept of a sensory scale. 
When we attempt to determine the 
nature of a sensory scale, we are at- 
tempting to measure some invariant of 
the sensory process (usually an invariant 
with respect to magnitude relations). 
To use the numerical responses or judg- 
ments of an O to determine the nature of 
a sensory scale is legitimate only if we can 
demonstrate the validity of the numerical
responses as proper indicators of the
sensory process we are trying to measure. 

Direct validation of the numerical re- 
sponses is impossible, because we have no 
independent measure of the sensory 
process itself. This fact in itself has 
often been used as an excuse for not 
attempting a validation. There are, 
however, other ways of getting valida- 
tion. Validation can be obtained by 
using converging operations to arrive at 
a single construct or concept. If two or 
more independent sets of data, involving 
basically different indicators of the 
nature of the sensory process, lead to the 
same sensory scale, then we have a form 
of validation. Such validation is prob- 
ably the only meaningful kind in this and 
in other areas of psychology. All valid 
concepts are formed from independent 
observations and operations which allow 
convergence to the single concept, al- 
though most techniques of validation 
used in psychology (with the obvious 
exception of factor analysis) do not make 
this process obvious. 

It is not necessary, in using converging 
operations to arrive at a single sensory 
scale, that we assume that the numerical 
responses are themselves valid. In fact, 
the evidence of the invalidity of such 
responses makes it seem unlikely that 
they can ever be used with much assur- 
ance. For these reasons, it has been 
argued in a recent paper (4) that sensory 
scales of magnitude, if they are valid 
at all, must be constructed by using this 
principle of converging operations, but 
with operations for which the assumption 
of the validity of the numerical response 
is not necessary. 


SUMMARY 


In these experiments, Os were required to 
make half-loudness judgments with a method of
constant stimuli. Each of three groups of Os 
was given a different nonoverlapping range of 
variable stimuli to be judged with respect to a 
standard stimulus. The major results and 
conclusions are: 


1. For each group the mean intensity required 
for half loudness was not significantly different 
from the mid-point of the range of variable 
stimuli. Thus the judgments were made almost
completely with respect to the context of
presented stimuli. 

2. Within each group there were large in- 
dividual differences which were established early 
in the judgment series and were maintained over 
the entire series of judgments. 

3. It is concluded that such judgments are 
reliable but not valid for purposes of loudness 
scale construction. 

4. It is further pointed out that Os in general 
do not seem able to describe sensory magnitudes 
with a scale of numbers. Thus sensory scale 
construction must depend on the use of con- 
verging operations which do not require the 
assumption of the valid use of number scales 
by Os. 


REFERENCES 


1. Doughty, J. M. The effect of psychophysical
method and context on pitch and
loudness functions. J. exp. Psychol.,
1949, 39, 729-745.

2. Doughty, J. M., & Garner, W. R. Pitch
characteristics of short tones. II. Pitch
as a function of tonal duration. J. exp.
Psychol., 1948, 38, 478-494.

3. Garner, W. R. Some statistical aspects of
half-loudness judgments. J. acoust. Soc.
Amer., 1952, 24, 153-157.

4. Garner, W. R. A technique and a scale for
loudness measurement. J. acoust. Soc.
Amer., 1954, 26, 73-88.

5. Helson, H. Adaptation-level as a frame of
reference for prediction of psychophysical
data. Amer. J. Psychol., 1947, 60, 1-29.

6. Helson, H. Adaptation-level as a basis for
a quantitative theory of frames of reference.
Psychol. Rev., 1948, 55, 297-313.

7. Koester, T., & Schoenfeld, W. N. The
effect of context upon judgments of pitch
differences. J. exp. Psychol., 1946, 36,
417-430.

8. Robinson, D. W. The relation between the
sone and phon scales of loudness. Acustica,
1953, 3, 344-358.

9. Stevens, S. S., & Davis, H. Hearing. New
York: Wiley, 1938.


(Received for early publication 
June 21, 1954) 








